(57) Abstract

The invention relates to Streptococcus suis infections of pigs, to vaccines directed against those infections and to tests for diagnosing
Streptococcus suis infections. The invention provides an isolated or recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis or a gene or gene fragment derived therefrom. The invention furthermore provides a nucleic acid probe or primer
allowing species- or serotype-specific detection of Streptococcus suis. The invention also provides a Streptococcus suis antigen and vaccine
derived therefrom.

WO 00/05378 PCT/NL99/00460

Title: Streptococcus suis vaccines and diagnostic tests. 



The invention relates to Streptococcus infections of
pigs, to vaccines directed against those infections, to tests
for diagnosing Streptococcus infections and to the field of
bacterial vaccines, more in particular to vaccines directed
against Streptococcus infections.

Streptococcus species, of which a large variety cause
infections in domestic animals and man, are often grouped
according to Lancefield's groups. Typing according to
Lancefield is based on serological determinants or antigens
that are, among others, present in the capsule of the
bacterium, and allows only an approximate determination:
bacteria from different groups often show cross-reactivity
with each other, while other Streptococci cannot be assigned
a group determinant at all. Within groups, further
differentiation is often possible on the basis of serotyping;
these serotypes further contribute to the large antigenic
variability of Streptococci, a fact that creates an array of
difficulties in the diagnosis of and vaccination against
Streptococcal infections.

Lancefield group A Streptococcus species (GAS,
Streptococcus pyogenes) are common in children, causing
nasopharyngeal infections and complications thereof. Among
animals, cattle especially are susceptible to GAS, in which
mastitis is often found.

Group A streptococci are the etiologic agents of
streptococcal pharyngitis and impetigo, two of the commonest
bacterial infections in children, as well as a variety of less
common but potentially life-threatening infections, including
soft tissue infections, bacteraemia, and pneumonia. In
addition, GAS are uniquely associated with the postinfectious
autoimmune syndromes of acute rheumatic fever and
poststreptococcal glomerulonephritis.

Several recent reports suggest that the incidence both of
serious infections due to GAS and of acute rheumatic fever has
increased during the past decade, focusing renewed interest on
defining the attributes or virulence factors of the organism
that may play a role in the pathogenesis of these diseases.

GAS produce several surface components and extracellular
products that may be important in virulence. The major surface
protein, M protein, has been studied in the most detail and
has been shown convincingly to play a role in both virulence
and immunity. Isolates rich in M protein are able to grow in
human blood, a property thought to reflect the capacity of M
protein to interfere with phagocytosis, and these isolates
tend to be virulent in experimental animals.

Lancefield group B Streptococcus (GBS) are most often
seen in cattle, causing mastitis; however, human infants are
susceptible as well, often with fatal consequences. Group B
streptococci constitute a major cause of bacterial
sepsis and meningitis among human neonates born in the United
States and Western Europe and are emerging as significant
neonatal pathogens in developing countries as well.

It is estimated that GBS strains are responsible for
10,000 to 15,000 cases of invasive infection in neonates in
the United States alone. Despite advances in early diagnosis
and treatment, neonatal sepsis due to GBS continues to carry a
mortality rate of 15 to 20%. In addition, survivors of GBS
meningitis have a 30 to 50% incidence of long-term neurologic
sequelae. The increasing recognition over the past two decades
of GBS as an important pathogen for human infants has
generated renewed interest in defining the bacterial and host
factors important in virulence of GBS and in the immune
response to GBS infection.

Particular attention has focused on the capsular
polysaccharide as the predominant surface antigen of the
organisms. In a modification of the system originally
developed by Rebecca Lancefield, GBS strains are serotyped on
the basis of antigenic differences in their capsular
polysaccharides and the presence or absence of serologically
defined C proteins. While GBS isolated from non-human sources
often lack a serologically detectable capsule, a large
majority of strains associated with neonatal infection belong
to one of four major capsular serotypes, Ia, Ib, II or III.
The capsular polysaccharide forms the outermost layer around
the exterior of the bacterial cell, superficial to the cell
wall. The capsule is distinct from the cell wall-associated
group B carbohydrate. It has been suggested that the presence
of sialic acid in the capsule of bacteria that cause
meningitis is important for these bacteria to breach the
blood-brain barrier. Indeed, in S. agalactiae sialic acid has
been shown to be critical for the virulence function of the
type III capsule. The capsule of S. suis serotype is composed of
glucose, galactose, N-acetylglucosamine, rhamnose and sialic
acid.

The group B polysaccharide, in contrast to the
type-specific capsule, is present on all GBS strains and is the
basis for serogrouping of the organisms into Lancefield's
group B. Early studies by Lancefield and co-workers showed
that antibodies raised in rabbits against whole GBS organisms
protected mice against challenge with strains of homologous
capsular type, demonstrating the central role of the capsular
polysaccharide as a protective antigen. Studies in the 1970s
by Baker and Kasper demonstrated that cord blood of human
infants with type III GBS sepsis uniformly had low or
undetectable levels of antibodies directed against the type
III capsule, suggesting that a deficiency of anticapsular
antibody was a key factor in susceptibility of human neonates
to GBS disease.

Lancefield group C infections, such as those with S.
equi, S. zooepidemicus, S. dysgalactiae, and others, are mainly
seen in horses, cattle and pigs, but can also cross the
species barrier to humans. Lancefield group D (S. bovis)
infections are found in all mammals and some birds,
sometimes resulting in endocarditis or septicaemia.

Lancefield groups E, G, L, P, U and V (S. porcinus, S.
canis, S. dysgalactiae) are found in various hosts, causing
neonatal infections, nasopharyngeal infections or mastitis.

Within Lancefield groups R, S, and T (and with ungrouped
types) S. suis is found, an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs.
Occasionally, it can also cause meningitis in man.

Streptococcus suis is an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs (4, 46).
Occasionally, it can also cause meningitis in man (1). S. suis
strains are usually identified and classified by their
morphological, biochemical and serological characteristics (58,
59, 46). Serological classification is based on the presence of
specific antigenic polysaccharides. So far, 35 different
serotypes have been described (9, 56, 14). In several European
countries, S. suis serotype 2 is the most prevalent type
isolated from diseased pigs, followed by serotypes 9 and 1.
Serological typing of S. suis is carried out using different
types of agglutination tests. In these tests, isolated and
biochemically characterised S. suis cells are agglutinated with
a panel of 35 specific sera. These methods are very laborious
and time-consuming.

Little is known about the pathogenesis of the disease caused
by S. suis, let alone about its various serotypes such as type
2. Various bacterial components, such as extracellular and
cell-membrane associated proteins, fimbriae, haemagglutinins,
and haemolysin have been suggested as virulence factors (9, 10,
11, 15, 16, 47, 49). However, the precise role of these protein
components in the pathogenesis of the disease remains unclear
(37). It is well known that the polysaccharidic capsule of
various Streptococci and other gram-positive bacteria plays an
important role in pathogenesis (3, 6, 35, 51, 52). The capsule
enables these micro-organisms to resist phagocytosis and is
therefore regarded as an important virulence factor. Recently,
a role of the capsule of S. suis in the pathogenesis was
suggested as well (5). However, the structure, organisation and
functioning of the genes responsible for capsule polysaccharide
synthesis (cps) in S. suis are unknown. Within S. suis serotypes
1 and 2, strains can differ in virulence for pigs (41, 45, 49).
Some type 1 and 2 strains are virulent, other strains are not.
Because both virulent and non-virulent strains of serotypes 1
and 2 are fully encapsulated, it may even be that the
capsule is not a relevant factor required for virulence.

Attempts to control S. suis infections or disease are
still hampered by the lack of knowledge about the epidemiology
of the disease and the lack of effective vaccines and
sensitive diagnostics. It is well known and generally accepted
that the polysaccharidic capsule of various Streptococci and
other gram-positive bacteria plays an important role in
pathogenesis. The capsule enables these micro-organisms to
resist phagocytosis and is therefore regarded as an important
virulence factor.

Compared to encapsulated S. suis strains,
non-encapsulated S. suis strains are phagocytosed by murine
polymorphonuclear leucocytes to a greater degree. Moreover, an
increase in thickness of capsule was noted for in vivo grown
virulent strains while no increase was observed for avirulent
strains. Therefore, these data again demonstrate the role of
the capsule in the pathogenesis for S. suis as well.

Ungrouped Streptococcus species, such as S. mutans, causing
caries in humans, S. uberis, causing mastitis in cattle,
and S. pneumoniae, causing major infections in humans, and
Enterococcus faecalis and E. faecium, further contribute to
the large group of Streptococci.

Streptococcus pneumoniae (the pneumococcus) is a human
pathogen causing invasive diseases, such as pneumonia,
bacteraemia, and meningitis. Despite the availability of
antibiotics, pneumococcal infections remain common and can
still be fatal, especially in high-risk groups, such as young
children and elderly people. Particularly in developing
countries, many children under the age of five years die each
year from pneumococcal pneumonia. S. pneumoniae is also the
leading cause of otitis media and sinusitis. These infections
are less serious, but nevertheless incur substantial medical
costs, especially when leading to complications, such as
permanent deafness. The normal ecological niche of the
pneumococcus is the nasopharynx of man. The entire human
population is colonised by the pneumococcus at one time or
another, and at a given time, up to 60% of individuals may be
carriers. Nasopharyngeal carriage of pneumococci by man is
often accompanied by the development of protection against
infection by the same serotype. Most infections do not occur
after prolonged carriage but follow the acquisition of
new strains. Many bacteria contain surface
polysaccharides which act as a protective layer against the
environment. Surface polysaccharides of pathogenic bacteria
usually make the bacteria resistant to the defense mechanisms
of the host, e.g., the lytic action of serum or phagocytosis.
In this respect, the serotype-specific capsular polysaccharide
(CP) of Streptococcus pneumoniae is an important virulence
factor. Unencapsulated strains are avirulent, and antibodies
directed against the CP are protective. Protection is serotype
specific; each serotype has its own, specific CP structure.
Ninety different capsular serotypes have been identified.
Currently, CPs of 23 serotypes are included in a vaccine.

Vaccines directed against Streptococcus infections in
general aim at utilising an immune response directed against
the polysaccharide capsule of the various Streptococcus
species, especially since the capsule is considered a main
virulence factor for these bacteria. The capsule, during
infection, provides resistance to phagocytosis and thus
promotes the escape of the bacteria from the immune system of
the host, protecting the bacteria from elimination by
macrophages and neutrophils.

The capsule particularly confers on the bacterium resistance
to complement-mediated opsonophagocytosis. In addition, some
bacteria express capsular polysaccharides (CPs) that mimic
host molecules, thereby avoiding the immune system of the
host. Also, even when the bacteria have been phagocytosed,
intracellular killing is hampered by the presence of a
capsule.

It is in general thought that only when the host has
antibodies or other serum-factors directed against capsule
antigens, the bacterium will get recognised by the immune
system through the anticapsular-antibodies or serum-factors
bound to its capsule, and will, through opsonisation, get
phagocytosed and killed.

However, these antibodies are serotype-specific, and will
often confer protection against only one of the many
serotypes known within a group of Streptococci.

For example, current commercially available S. suis
vaccines, which are in general based on whole-cell-bacterial
preparations, or on capsule-enriched fractions of S. suis,
confer only limited protection against heterologous strains.
Also, the current pneumococcal vaccine, licensed in the United
States in 1983, consists of purified CPs of 23 pneumococcal
serotypes whereas at least 90 CP types exist.

The composition of this pneumococcal vaccine was based on
the frequency of the occurrence of disease isolates in the US
and cross-reactivity between various serotypes. Although this
vaccine protects healthy adults against infections caused by
serotypes included in the vaccine, it fails to raise a
protective immune response in infants younger than 18 months
and it is less effective in elderly people. In addition, the
vaccine confers only limited protection in patients with
immunodeficiencies and haematological malignancies.

In the light of the above, improved vaccines are needed against
Streptococcus infections. Much attention is being paid to
producing CP vaccines by producing the relevant polysaccharides
via chemical or recombinant means. However, chemical synthesis
of polysaccharides is costly, and capsular polysaccharide
synthesis by recombinant means necessitates knowledge about the
relevant genes, which are not always available and need to be
determined for each and every relevant serotype.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular (cps) gene cluster of Streptococcus
suis. Biosynthesis of capsule polysaccharides in general has
been studied in a number of Gram-positive and Gram-negative
bacteria (32). In Gram-negative bacteria, but also in a number
of Gram-positive bacteria, genes which are involved in the
biosynthesis of polysaccharides are clustered at a single
locus. Streptococcus suis capsular genes as provided by the
invention show a common genetic organisation involving three
distinct regions. The central region is serotype specific and
encodes enzymes responsible for the synthesis and
polymerisation of the polysaccharides. This region is flanked
by two regions conserved in Streptococcus suis which encode
proteins for common functions such as transport of the
polysaccharide across the cellular membrane. However,
between species, only low homologies exist, hampering easy
comparison and detection of seemingly similar genes. Knowing
the nucleic acid encoding the flanking regions allows
type-specific determination of nucleic acid of the central region of
Streptococcus suis serotypes, as for example described in the
experimental part of the description of the invention.
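
By way of illustration only, the three-region organisation described above can be pictured as a small data model. The sketch below merely restates the text (a serotype-specific central region flanked by two regions conserved within Streptococcus suis); the class name, field names and region labels are hypothetical and no sequence data of the invention are represented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """One region of a cps gene cluster (illustrative model only)."""
    name: str        # hypothetical label, not a name used in the description
    conserved: bool  # True if conserved among S. suis serotypes
    role: str        # function as summarised in the text


def cps_cluster_layout() -> List[Region]:
    """Return the three regions in the order flanking, central, flanking."""
    common = "common functions, e.g. transport of the polysaccharide"
    return [
        Region("flanking region 1", True, common),
        Region("central region", False,
               "synthesis and polymerisation of the polysaccharide"),
        Region("flanking region 2", True, common),
    ]


def type_specific_regions(layout: List[Region]) -> List[str]:
    """Names of the regions expected to differ between serotypes."""
    return [region.name for region in layout if not region.conserved]


if __name__ == "__main__":
    print(type_specific_regions(cps_cluster_layout()))
```

In this model the conserved flanking regions are the ones from which the position of the central region can be approached, which mirrors the statement that knowledge of the flanking regions allows type-specific determination of the central region.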

The invention provides an isolated or recombinant nucleic
acid encoding a capsular gene cluster of Streptococcus suis or
a gene or gene fragment derived therefrom. Such a nucleic acid
is for example provided by hybridising chromosomal DNA derived
from any one of the Streptococcus suis serotypes to a nucleic
acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster, as provided by the
invention (see for example Tables 4 and 5) and cloning of
(type-specific) genes as for example described in the
experimental part of the description. At least 14 open reading
frames are identified. Most of the genes belong to a single
transcriptional unit, indicating co-ordinate control of
these genes; they, and the enzymes and proteins they encode,
act in concert to provide the capsule with the relevant
polysaccharides. The invention provides cps genes and proteins
encoded thereby involved in regulation (CpsA), chain length
determination (CpsB, C), export (CpsC) and biosynthesis (CpsE,
F, G, H, J, K). Although the overall organisation seemed at
first glance to be similar to that of the cps and eps gene
clusters of a number of Gram-positive bacteria (19, 32, 42),
overall homologies are low (see Table 3). The region involved
in biosynthesis is located at the centre of the gene cluster
and is flanked by two regions containing genes with more
common functions.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular gene cluster of Streptococcus suis
serotype 2 or a gene or gene fragment derived therefrom,
preferably as identified in Figure 3. Genes in this gene
cluster are involved in polysaccharide biosynthesis of
capsular components and antigens. For a further description of
such genes see for example Table 2 of the description; for
example, a cpsA gene is provided functionally encoding
regulation of capsular polysaccharide synthesis, whereas cpsB
and cpsC are functionally involved in chain length
determination. Other genes, such as cpsD, E, F, G, H, I, J, K
and related genes, are involved in polysaccharide synthesis,
functioning for example as glucosyl- or glycosyltransferase.
The cpsF, G, H, I, J genes encode more type-specific proteins
than the flanking genes, which are found more-or-less conserved
throughout the species and can serve as a basis for selection of
primers or probes in PCR amplification or cross-hybridisation
experiments for subsequent cloning.

For example, the invention further provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 1 or a gene or gene fragment
derived therefrom, preferably as identified in Figure 4.

In addition, the invention provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 9 or a gene or gene fragment
derived therefrom, preferably as identified in Figure 5.
Furthermore, the invention provides for example a fragment, or
parts thereof, of the cps locus involved in the capsular
polysaccharide biosynthesis of S. suis, exemplified in the
experimental part for serotype 1, 2 or 9, and allows easy
identification or detection of related fragments derived from
other serotypes of S. suis.

The invention provides a nucleic acid probe or primer
derived from a nucleic acid according to the invention
allowing species- or serotype-specific detection of
Streptococcus suis. Such a probe or primer (herein used
interchangeably) is for example a DNA, RNA or PNA (peptide
nucleic acid) probe hybridising with capsular nucleic acid as
provided by the invention. Species-specific detection is
provided preferably by selecting a probe or primer sequence
from a species-specific region (e.g. flanking region), whereas
serotype-specific detection is provided preferably by
selecting a probe or primer sequence from a type-specific
region (e.g. central region) of a capsular gene cluster as
provided by the invention. Such a probe or primer can be used
in a further unmodified form, for example in
cross-hybridisation or polymerase chain reaction (PCR) experiments
as for example described in the experimental part of the
description of the invention. Herein the invention provides
the isolation and molecular characterisation of additional
type-specific cps genes of S. suis types 1 and 9. In addition,
we describe the genetic diversity of the cps loci of serotypes
1, 2 and 9 among the 35 S. suis serotypes yet known.
Type-specific probes are identified. Also, a type-specific PCR,
for example for serotype 9, is provided, being a rapid, reliable
and sensitive assay, which is used directly on nasal or
tonsillar swabs or other samples of infected or carrier
animals.
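
The selection rule stated above (flanking region for species-specific detection, central region for serotype-specific detection) can be sketched as follows. This is a minimal illustration and not the assay of the experimental part: the enum, the function names and, in particular, the way two probe results are combined in `interpret` are assumptions made for the example only.

```python
from enum import Enum


class DetectionLevel(Enum):
    SPECIES = "species"
    SEROTYPE = "serotype"


def preferred_probe_region(level: DetectionLevel) -> str:
    """Region of the cps cluster from which a probe or primer is preferably chosen."""
    if level is DetectionLevel.SPECIES:
        return "flanking"  # conserved within S. suis
    if level is DetectionLevel.SEROTYPE:
        return "central"   # serotype specific
    raise ValueError(f"unknown detection level: {level!r}")


def interpret(flanking_hit: bool, central_hit: bool, serotype: str) -> str:
    """Hypothetical reading of a two-probe test (assumption, not from the source).

    flanking_hit: signal obtained with a flanking-region probe
    central_hit:  signal obtained with a central-region probe for `serotype`
    """
    if flanking_hit and central_hit:
        return f"S. suis, serotype {serotype}"
    if flanking_hit:
        return f"S. suis, serotype other than {serotype}"
    if central_hit:
        return "inconclusive"
    return "S. suis not detected"


if __name__ == "__main__":
    print(preferred_probe_region(DetectionLevel.SEROTYPE))
    print(interpret(True, True, "9"))
```

Treating a central-region signal without a flanking-region signal as inconclusive is a cautious design choice for the sketch; an actual test would define its own controls and decision rules.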

The invention also provides a probe or primer according to
the invention further provided with at least one reporter
molecule. Examples of reporter molecules are manifold and
known in the art; for example, a reporter molecule can comprise
additional nucleic acid provided with a specific sequence
(e.g. oligo-dT) hybridising to a corresponding sequence to
which hybridisation can easily be detected, for example because
it has been immobilised to a solid support.
Yet other reporter molecules comprise chromophores, e.g.
fluorochromes for visual detection, for example by light
microscopy or fluorescent in situ hybridisation (FISH)
techniques, or comprise an enzyme such as horseradish
peroxidase for enzymatic detection, e.g. in enzyme-linked
assays (EIA). Yet other reporter molecules comprise
radioactive compounds for detection in radiation-based assays.

In a preferred embodiment of the invention, at least one
probe or primer according to the invention is provided
(labelled) with a reporter molecule and a quencher molecule,
providing together with unlabelled probe or primer a PCR-based
test allowing rapid detection of specific hybridisation.

The invention further provides a diagnostic test or test kit
comprising a probe or primer as provided by the invention.
Such a test or test kit, for example a cross-hybridisation
test or PCR-based test, is advantageously used in rapid
detection and/or serotyping of Streptococcus suis.

The invention furthermore provides a protein or fragment
thereof encoded by a nucleic acid according to the invention.
Examples of such a protein or fragment are
proteins described in for example Table 2 of the description;
for example, a CpsA protein is provided that functions in the
regulation of capsular polysaccharide synthesis, whereas CpsB
and CpsC are functionally involved in chain length
determination. Other proteins or functional fragments thereof
as provided by the invention, such as CpsD, E, F, G, H, I, J,
K and related proteins, are involved in polysaccharide
biosynthesis, functioning for example as glucosyl- or
glycosyltransferase in polysaccharide biosynthesis of
Streptococcus suis capsular antigen.

The invention furthermore provides a method to produce a
Streptococcus suis capsular antigen comprising using a protein
or functional fragment thereof as provided by the invention,
and provides therewith a Streptococcus suis capsular antigen
obtainable by such a method. A comparison of the predicted
amino acid sequences of the cps2 genes with sequences found in
the databases allowed the assignment of functions to the open
reading frames. The central region contains the type-specific
glycosyltransferases and the putative polysaccharide
polymerase. This region is flanked by two regions encoding
proteins with common functions, such as regulation and
transport of polysaccharide across the membrane.
Biosynthesis of Streptococcus capsular polysaccharide antigen
using a protein or functional fragment thereof is
advantageously used in chemo-enzymatic synthesis and the
development of vaccines which offer protection against
serotype-specific Streptococcal disease, and is also
advantageously used in the synthesis and development of
multivalent vaccines against Streptococcal infections. Such
vaccines elicit anticapsular antibodies which confer
protection.

Furthermore, the invention provides an acapsular
Streptococcus mutant for use in a vaccine, a vaccine strain
derived therefrom and a vaccine derived therefrom. Surprisingly,
and against the grain of common doctrine, the invention
provides use of a Streptococcus mutant deficient in capsular
expression in a vaccine.

Acapsular Streptococcus mutants have long been known in
the art and can be found in nature. Griffith (J. Hyg.
27:113-159, 1928) demonstrated that pneumococci could be transformed
from one type to another. If he injected live rough (acapsular
or unencapsulated) type 2 pneumococci into mice, the mice
would survive. If, however, he injected the same dose of live
rough type 2 mixed with heat-killed smooth (encapsulated) type
1 into a mouse, the mouse would die, and from the blood he
could isolate live smooth type 1 pneumococci. At that time,
the significance of this transforming principle was not
understood. However, understanding came when it was shown that
DNA constituted the genetic material responsible for
phenotypic changes during transformation.

Streptococcus mutants deficient in capsular expression
are found in several forms. Some are fully deficient and have
no capsule at all; others form a deficient capsule,
characterised by a mutation in a capsular gene cluster.
Deficiency can for instance include capsular formation wherein
the organisation of the capsular material has been
rearranged, as for example demonstrable by electron microscopy.
Yet others have a nearly fully developed capsule which is only
deficient in a particular sugar component.

Now, after much advance of biotechnology and despite the
fact that little is still known about the exact localisation
and sequence of genes involved in capsular synthesis in
Streptococci, it is possible to create mutants of
Streptococci, for example by homologous recombination or
transposon mutagenesis, which has for example been done for
GAS (Wessels et al., PNAS 88:8317-8321, 1991), for GBS (Wessels
et al., PNAS 86:8983-8987, 1989), for S. suis (Smith, ID-DLO
Annual report 1996, pages 18-19; Charland et al., Microbiol.
144:325-332, 1998) and for S. pneumoniae (Kolkman et al., J.
Bact. 178:3736-3741, 1996). Such recombinant-derived mutants,
or isogenic mutants, can easily be compared with the wild-type
strains from which they have been derived.

In a preferred embodiment, the invention provides use of
a recombinant-derived Streptococcus mutant deficient in
capsular expression in a vaccine. Recombinant techniques
useful in producing such mutants are for example homologous
recombination, transposon mutagenesis, and others, whereby
deletions, insertions or (point) mutations are introduced in
the genome. Advantages of using recombinant techniques are the
stability of the obtained mutants (especially with homologous
recombination and double cross-over techniques), and the
knowledge about the exact site of the deletion, mutation or
insertion.

In a much preferred embodiment, the invention provides a
stable mutant deficient in capsular expression obtainable for
example through homologous recombination or cross-over
integration events. Examples of such a mutant can be found in
the experimental part of this description; for example, mutant
10cpsB or 10cpsEF is such a stable mutant as provided by the
invention.

The invention also provides a Streptococcus vaccine
strain and vaccine that has been derived from a Streptococcus
mutant deficient in capsular expression. In general, said
strain or vaccine is applicable within the whole range of
Streptococcal infections, be it for those with animals or man
or with zoonotic infections. It is of course now possible to
first select a common vaccine strain and derive a
Streptococcus mutant deficient in capsular expression therefrom
for the selection of a vaccine strain and use in a vaccine
according to the invention.

In a preferred embodiment, the invention provides use
of a Streptococcus mutant deficient in capsular expression in
a vaccine wherein said Streptococcus mutant is selected from
the group composed of Streptococcus group A, Streptococcus
group B, Streptococcus suis and Streptococcus pneumoniae.
Herewith the invention provides vaccine strains and vaccines
for use with these notoriously heterologous Streptococci, of
which a multitude of serotypes exist. With a vaccine as
provided by the invention that is derived from a specific
Streptococcus mutant that is deficient in capsular expression,
the difficulties relating to lack of heterologous protection
can be circumvented since these mutants do not rely on
capsular antigens per se to induce protection.

In a preferred embodiment, said vaccine strain is
selected for its ability to survive or even replicate in an
immune-competent host or host cells and thus can persist for a
certain period, varying from 1-2 days to more than one or two
weeks, in a host, despite its deficient character.

Although an immunodeficient host will support replication
of a wide range of bacteria that are deficient in one or more
virulence factors, in general it is considered a
characteristic of pathogenicity of Streptococci that they can
survive for certain periods or replicate in a normal host or
host cells such as macrophages. For example, Williams and
Blakemore (Neuropath. Appl. Neurobiol.: 16, 345-356, 1990;
Neuropath. Appl. Neurobiol.: 16, 377-392, 1990; J. Infect.
Dis.: 162, 474-481, 1990) show that, although both
polymorphonuclear cells and macrophage cells are capable of
phagocytosing pathogenic S. suis in pigs lacking anti-S. suis
antibodies, only pathogenic bacteria could survive and
multiply inside macrophages and the pig.

In a preferred embodiment, the invention, however, 
provides a deficient or avirulent mutant or vaccine strain 
which is capable of surviving at least 4-5 days, preferably at 
least 8-10 days in said host, thereby allowing the development 
of a solid immune response to subsequent Streptococcus 
infection.

Due to its persistent but avirulent character, a 
Streptococcus mutant or vaccine strain as provided by the 
invention is well suited to generate specific and/or long- 
lasting immune responses against Streptococcal antigens, 
moreover because possible specific immune responses of the 
host directed against a capsule are relatively irrelevant 
because a vaccine strain as provided by the invention is in 
general not recognised by such antibodies. 

In addition, the invention provides a Streptococcus 
vaccine strain according to the invention which strain comprises
a mutant capable of expressing a Streptococcus virulence 
factor or antigenic determinant. 

In a preferred embodiment, the invention provides a 
Streptococcus vaccine strain according to the invention which 
strain comprises a mutant capable of expressing a 
Streptococcus virulence factor wherein said virulence factor 
or antigenic determinant is selected from a group of cellular 
components, such as muramidase-released protein (MRP),
extracellular factor (EF) and cell-membrane associated
proteins, 60 kDa heat shock protein, pneumococcal surface
protein A (PspA), pneumolysin, C protein, protein M,
fimbriae, haemagglutinins and haemolysin or components
functionally related thereto.

In a much preferred embodiment, the invention provides a 
Streptococcus vaccine strain according to the invention which 
strain comprises a mutant capable of over-expressing said 
virulence factor. In this way, the invention provides a
vaccine strain for incorporation in a vaccine which
specifically causes a host to mount an immune response
directed against antigenically important determinants of
virulence (listed above), thereby providing specific
protection directed against said determinants. Over-expression
can for example be achieved by cloning the gene involved
behind a strong promoter, which is for example
constitutively expressed in a multicopy system, either in a
plasmid or via integration in a genome.

In yet another embodiment, the invention provides a 
Streptococcus vaccine strain according to the invention which 
comprises a mutant capable of expressing a non-Streptococcus
protein. Such a vector-Streptococcus vaccine strain allows,
when used in a vaccine, protection against pathogens other
than Streptococcus.

Due to its persistent but avirulent character, a 
Streptococcus vaccine strain or mutant as provided by the 
invention is well suited to generate specific and long-lasting 
immune responses, not only against Streptococcal antigens, but 
also against other antigens when these are expressed by said 
strain. Especially antigens derived from another pathogen are 
now expressed without the detrimental effects of said antigen 
or pathogen which would otherwise have harmed the host. 

An example of such a vector is a Streptococcus vaccine 
strain or mutant wherein said antigen is derived from a 
pathogen, such as Actinobacillus pleuropneumoniae,
Mycoplasmatae, Bordetella, Pasteurella, E. coli, Salmonella,
Campylobacter, Serpulina and others.

The invention also provides a vaccine comprising a
Streptococcus vaccine strain or mutant according to the
invention and further comprising a pharmaceutically acceptable
carrier or adjuvant. Carriers or adjuvants are well known in
the art; examples are phosphate buffered saline, physiological
salt solutions, (double) oil-in-water emulsions,
aluminum hydroxide, Specol, block- or co-polymers, and others.

A vaccine according to the invention can comprise a 
vaccine strain either in a killed or live form. For example, a 
killed vaccine comprising a strain having (over)expressed a
Streptococcal or heterologous antigen or virulence factor is
very well suited for eliciting an immune response. In a
preferred embodiment, the invention provides a vaccine wherein
said strain is live. Due to its persistent but avirulent
character, a Streptococcus vaccine strain as provided by the
invention is well suited to generate specific and long-lasting
immune responses.

Now that a Streptococcal vaccine is provided by the 
invention, the invention also provides a method for 
controlling or eradicating a Streptococcal disease in a 
population comprising vaccinating subjects in said population 
with a vaccine according to the invention. 

In a preferred embodiment, a method for controlling or 
eradicating a Streptococcal disease is provided comprising 
testing a sample, such as a blood sample, or nasal or throat 
swab, faeces, urine, or other samples such as can be sampled 
at or after slaughter, collected from at least one subject, 
such as an infant or a pig, in a population partly or wholly
vaccinated with a vaccine according to the invention for the
presence of encapsulated Streptococcal strains or mutants.
Since a vaccine strain or mutant according to the invention is
not pathogenic, and can be distinguished from wild-type
strains by capsular expression, the detection of (fully)
encapsulated Streptococcal strains indicates that wild-type
infections are still present. Such wild-type infected subjects
can then be isolated from the remainder of the population
until the infection has passed. With domestic animals,
such as pigs, it is even possible to remove the infected
subject from the population as a whole by culling. Detection
of wild-type strains can be achieved via traditional culturing
techniques, or by rapid detection techniques such as PCR
detection.

In yet another embodiment, the invention provides a 
method for controlling or eradicating a Streptococcal disease 
comprising testing a sample collected from at least one
subject in a population partly or wholly vaccinated with a
vaccine according to the invention for the presence of
capsule-specific antibodies directed against Streptococcal
strains. Capsule-specific antibodies can be detected with
classical techniques known in the art, such as used for
Lancefield's group typing or serotyping.

A much preferred embodiment of a method provided by the 
invention for controlling or eradicating a Streptococcal 
disease in a population comprises vaccinating subjects in said 
population with a vaccine according to the invention and
testing a sample collected from at least one subject in said
population for the presence of encapsulated Streptococcal
strains and/or for the presence of capsule-specific antibodies
directed against Streptococcal strains.

For example, a method is provided according to the
invention wherein said Streptococcal disease is caused by
Streptococcus suis.

The invention also provides a diagnostic assay for testing a 
sample for use in a method according to the invention 
comprising at least one means for the detection of
encapsulated Streptococcal strains and/or for the detection of
capsule-specific antibodies directed against Streptococcal
strains.

The invention furthermore provides a vaccine comprising an
antigen according to the invention and further comprising a
suitable carrier or adjuvant. The immunogenicity of a capsular
antigen provided by the invention is for example increased by
linking to a carrier (such as a carrier protein), allowing the
recruitment of T-cell help in developing an immune response.

The invention further provides a recombinant micro-organism
provided with at least a part of a capsular gene
cluster derived from Streptococcus suis. The invention
provides for example a lactic acid bacterium provided with at
least a part of a capsular gene cluster derived from
Streptococcus suis. Various food-grade lactic acid bacteria
(Lactococcus lactis, Lactobacillus casei, Lactobacillus
plantarum and Streptococcus gordonii) have been used as
delivery systems for mucosal immunization. It has now been
shown that oral (or mucosal) administration of recombinant L.
lactis, Lactobacillus, and Streptococcus gordonii can elicit
local IgA and/or IgG antibody responses to an expressed
antigen. The use of oral routes for immunization against
infective diseases is desirable because oral vaccines are
easier to administer, have higher compliance rates, and
because mucosal surfaces are the portals of entry for many
pathogenic microbial agents. It is within the skill of the
artisan to provide such micro-organisms with (additional)
genes.

The invention further provides a recombinant 
Streptococcus suis mutant provided with a modified capsular 
gene cluster. It is within the skill of the artisan to swap
genes within a species. In a preferred embodiment, an
avirulent Streptococcus suis mutant is selected to be provided
with at least a part of a modified capsular gene cluster 
according to the invention. 

The invention further provides a vaccine comprising a
micro-organism or a mutant provided by the invention. An advantage
of such a vaccine over currently used vaccines is that it
comprises accurately defined micro-organisms and
well-characterised antigens, allowing accurate determination of
immune responses against various antigens of choice.

The invention is further explained in the experimental part
of this description without limiting the invention thereto.

Experimental part
MATERIALS AND METHODS

Bacterial strains and growth conditions. 

The bacterial strains and plasmids used in this study are 
listed in Table 1. S. suis strains were grown in Todd-Hewitt
broth (code CM189, Oxoid), and plated on Columbia agar blood
base (code CM331, Oxoid) containing 6% (v/v) horse blood.
E. coli strains were grown in Luria broth (28) and plated on
Luria broth containing 1.5% (w/v) agar. If required,
antibiotics were added to the plates at the following
concentrations: spectinomycin, 100 µg/ml for S. suis and 50
µg/ml for E. coli; and ampicillin, 50 µg/ml.

Serotyping. The S. suis strains were serotyped by the slide
agglutination test with serotype-specific antibodies (44).
DNA techniques. Routine DNA manipulations were performed as 
described by Sambrook et al. (36). 

Alkaline phosphatase activity. To screen for PhoA fusions in
E. coli, plasmid libraries were constructed. To this end,
chromosomal DNA of S. suis type 2 was digested with AluI. The
300-500-bp fragments were ligated to SmaI-digested pPHOS2.
Ligation mixtures were transformed to the PhoA- E. coli strain
CC118. Transformants were plated on LB media supplemented with
5-bromo-4-chloro-3-indolylphosphate (BCIP, 50 µg/ml, Boehringer,
Mannheim, Germany). Blue colonies were purified on fresh
LB/BCIP plates to verify the blue phenotype.

DNA sequence analysis. DNA sequences were determined on a 373A
DNA Sequencing System (Applied Biosystems, Warrington, GB).
Samples were prepared by use of an ABI/PRISM dye terminator
cycle sequencing ready reaction kit (Applied Biosystems).
Sequencing data were assembled and analyzed using the
MacMollyTetra program. Custom-made sequencing primers were
purchased from Life Technologies. Hydrophobic stretches within
proteins were predicted by the method of Klein et al. (17). The
BLAST program available on Netscape Navigator™ was used to
search for protein sequences related to the deduced amino acid
sequences.

Construction of gene-specific knock-out mutants of S. suis. To
construct the mutant strains 10cpsB and 10cpsEF we
electrotransformed the pathogenic serotype 2 strain 10
(45, 49) of S. suis with pCPS11 and pCPS28 respectively. In
these plasmids the cpsB and cpsEF genes were disrupted by the
insertion of a spectinomycin-resistance gene. To create pCPS11
the internal 400 bp PstI-BamHI fragment of the cpsB gene in
pCPS7 was replaced by the SpcR gene. For this purpose pCPS7 was
digested with PstI and BamHI and ligated to the 1,200-bp
PstI-BamHI fragment, containing the SpcR gene, from pIC-spc. To
construct pCPS28 we used pIC20R. In this plasmid we
inserted the KpnI-SalI fragment from pCPS17 (resulting in
pCPS25) and the XbaI-ClaI fragment from pCPS20 (resulting in
pCPS27). pCPS27 was digested with PstI and XhoI and ligated to
the 1,200-bp PstI-XhoI fragment, containing the SpcR gene of
pIC-spc. The electrotransformation to S. suis was carried out
as described before (38).

Southern blotting and hybridization. Chromosomal DNA was
isolated as described by Sambrook et al. (36). DNA fragments
were separated on 0.8% agarose gels and transferred to
Zeta-Probe GT membranes (Bio-Rad) as described by Sambrook et al.
(36). DNA probes were labelled with [α-32P]dCTP (3000 Ci/mmol;
Amersham) by use of a random primed labelling kit
(Boehringer). The DNA on the blots was hybridized at 65°C with
appropriate DNA probes as recommended by the supplier of the
Zeta-Probe membranes. After hybridization, the membranes were
washed twice with a solution of 40 mM sodium phosphate, pH 7.2,
1 mM EDTA, 5% SDS for 30 min at 65°C and twice with a solution
of 40 mM sodium phosphate, pH 7.2, 1 mM EDTA, 1% SDS for 30 min
at 65°C.

PCR. The primers used in the cps2J PCR correspond to the
positions 13791-13813 and 14465-14443 in the S. suis cps2
locus. The sequences were: 5'-CAAACGCAAGGAATTACGGTATC-3' and
5'-GAGTATCTAAAGAATGCCTATTG-3'. The primers used for the cps1I
PCR correspond to the positions 4398-4417 and 4839-4821 in the
S. suis cps1 sequence. The sequences were:
5'-GGCGGTCTAGCAGATGCTCG-3' and 5'-GCGAACTGTTAGCAATGAC-3'. The
primers used in the cps9H PCR correspond to the positions
4406-4126 and 4494-4475 in the S. suis cps9 sequence. The
sequences were: 5'-GGCTACATATAATGGAAGCCC-3' and
5'-CGGAAGTATCTGGGCTACTG-3'.
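The primer coordinates quoted above allow a simple consistency check and an estimate of the expected product size. The following is a minimal sketch, assuming that the quoted positions are 1-based and inclusive and that the amplicon runs from the first base of the forward primer to the first base of the reverse primer; the function names are illustrative only. The cps9H pair is left out because its coordinates, as printed, do not describe a consistent primer span.

```python
# Primer coordinates and sequences as quoted in the PCR paragraph.
# Each entry: (start position, end position, sequence written 5'->3').
PRIMERS = {
    "cps2J": {
        "forward": (13791, 13813, "CAAACGCAAGGAATTACGGTATC"),
        "reverse": (14465, 14443, "GAGTATCTAAAGAATGCCTATTG"),
    },
    "cps1I": {
        "forward": (4398, 4417, "GGCGGTCTAGCAGATGCTCG"),
        "reverse": (4839, 4821, "GCGAACTGTTAGCAATGAC"),
    },
}


def span_length(start, end):
    """Number of bases covered by an inclusive coordinate pair."""
    return abs(end - start) + 1


def amplicon_size(pair):
    """Expected product length, assuming inclusive 1-based coordinates."""
    forward_start = pair["forward"][0]
    reverse_start = pair["reverse"][0]
    return span_length(forward_start, reverse_start)


def coordinates_match_sequences(pair):
    """True if every primer is as long as its coordinate span implies."""
    return all(span_length(s, e) == len(seq) for s, e, seq in pair.values())


for name, pair in PRIMERS.items():
    print(name, amplicon_size(pair), coordinates_match_sequences(pair))
```

Under these assumptions the cps2J pair spans 675 bp and the cps1I pair 442 bp, and in both pairs the primer lengths agree with the quoted coordinates.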

Phagocytosis assay. Phagocytosis assays were performed as
described by Leij et al. (23). Briefly, to opsonize the cells,
10^7 S. suis cells were incubated with 6% SPF-pig serum for 30
min at 37°C in a head-over-head rotor at 6 rpm. 10^ AM and 10^7
opsonized S. suis cells were combined and incubated at 37°C
under continuous rotation at 6 rpm. At 0, 30, 60 and 90 min,
1-ml samples were collected and mixed with 4 ml of ice-cold EMEM
to stop phagocytosis. Phagocytes were removed by centrifugation
for 4 min at 110 x g and 4°C. The number of colony forming
units (CFU) in the supernatants was determined. Control
experiments were carried out simultaneously by combining 10^
opsonized S. suis cells with EMEM (without AM).
Killing assays. AM (10^7/ml) and opsonized S. suis cells
(10^7/ml) were mixed 1:1 and incubated for 10 min at 37°C
under continuous rotation at 6 rpm. Ice-cold EMEM was added to
stop further phagocytosis and killing. To remove extracellular
S. suis cells, phagocytes were washed twice (4 min, 110 x g,
4°C) and resuspended in 5 ml EMEM containing 6% SPF serum. The
tubes were incubated at 37°C under rotation at 6 rpm. After 0,
15, 30, 60 and 90 min, samples were collected and mixed with
ice-cold EMEM to stop further killing. The samples were
centrifuged for 4 min at 110 x g at 4°C and the phagocytic
cells were lysed in EMEM containing 1% saponin for 20 min at
room temperature. The number of CFU in the suspensions was
determined.
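The description does not state how the CFU counts from the phagocytosis and killing assays were normalised. One common way to compare strains is to express each time point as a percentage of the count at 0 min. The sketch below assumes that convention; the function name and the example counts are hypothetical and are not data from this study.

```python
def percent_of_baseline(cfu_by_minute):
    """Express CFU counts as a percentage of the count at 0 min.

    cfu_by_minute maps sampling time (min) to CFU count.
    Normalising to the 0 min sample is an assumption of this sketch.
    """
    if 0 not in cfu_by_minute:
        raise ValueError("a 0 min sample is needed as baseline")
    baseline = cfu_by_minute[0]
    if baseline <= 0:
        raise ValueError("baseline CFU count must be positive")
    return {
        minute: 100.0 * cfu / baseline
        for minute, cfu in sorted(cfu_by_minute.items())
    }


# Hypothetical counts at the sampling times used in the killing assay.
example_counts = {0: 2.0e5, 15: 1.5e5, 30: 1.0e5, 60: 5.0e4, 90: 2.0e4}

for minute, pct in percent_of_baseline(example_counts).items():
    print(f"{minute:>3} min: {pct:.1f}% of initial CFU")
```

Working with relative values of this kind makes curves from strains with slightly different starting inocula easier to compare.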

Pigs. Germfree pigs, cross-breeds of Great Yorkshire and Dutch
landrace, were obtained from sows by caesarean sections. The
surgery was performed in sterile flexible film isolators. Pigs
were allotted to groups, each consisting of 4 pigs, and were
housed in sterile stainless steel incubators.

Experimental infections. Pigs were inoculated intranasally with
S. suis type 2 as described before. To predispose the pigs for
infection with S. suis, five-day-old pigs were inoculated
intranasally with about 10^7 CFU of Bordetella bronchiseptica
strain 92932. Two days later the pigs were inoculated
intranasally with S. suis type 2 (10^ CFU). Pigs were monitored
twice daily for clinical signs of disease, such as fever,
nervous signs and lameness. Blood samples were collected three
times a week from each pig. White blood cells were counted with
a cell counter. To monitor infection with S. suis and B.
bronchiseptica and to check for absence of contaminants, we
collected swabs of nasopharynx and feces daily. The swabs were
plated directly onto Columbia agar containing 6% horse blood.
After three weeks the pigs were killed and examined for
pathological changes. Tissue specimens from the central nervous
system, serosae, and joints were examined bacteriologically and
histologically as described before (45, 49). Colonization of
the serosae was scored positively when S. suis was isolated
from the pericardium, thoracic pleura or the peritoneum.
Colonization of the joints was scored positively when S. suis
was isolated from one or more joints (12 joints per animal were
scored).

Vaccination and challenge

One-week-old pigs were vaccinated intravenously with a dosage
of 10^6 cfu of the S. suis strains 10cpsEF or 10cpsB. Three
weeks later the pigs were challenged intravenously with the
pathogenic serotype 2 strain 10 (10^7 cfu). Disease monitoring,
haematological, serological and bacteriological examinations as
well as post-mortem examinations were as described before under
experimental infections.

Electron Microscopy. Bacteria were prepared for electron
microscopy as described by Wagenaar et al. (50). Briefly,
bacteria were mixed with agarose MP (Boehringer) at 37°C to a
concentration of 0.7%. The mixture was immediately cooled on
ice. Upon gelling, samples were cut into 1 to 1.5 mm slices
and incubated in a fixative containing 0.8% glutaraldehyde and
0.8% osmium tetroxide. Subsequently, the samples were fixed
and stained with uranyl acetate by microwave stimulation,
dehydrated and embedded in Epon-Araldite resin. Ultra-thin
sections were counterstained with lead citrate and examined
with a Philips CM 10 electron microscope at 80 kV.

Isolation of porcine alveolar macrophages (AM). Porcine AM were
obtained from the lungs of specific pathogen free (SPF) pigs.
Lung lavage samples were collected as described by van Leengoed
et al. (43). Cells were suspended in EMEM containing 6% (v/v)
SPF-pig serum and adjusted to 10^7 cells per ml.

Identification of the cps locus.

The cps locus of S. suis type 2 was identified by making use of
a strategy developed for the genetic identification of exported
proteins (13, 31). In this system we made use of a plasmid
(pPHOS2) containing a truncated alkaline phosphatase gene (13).
The gene lacked the promoter sequence, the translational start
site and the signal sequence. The truncated gene is preceded by
a unique SmaI restriction site. Chromosomal DNA of S. suis type
2, digested with AluI, was randomly cloned in this restriction
site. Because translocation of PhoA across the cytoplasmic
membrane of E. coli is required for enzymatic activity, the
system can be used to select for S. suis fragments containing a
promoter sequence, a translational start site and a functional
signal sequence. Among 560 individual E. coli clones tested, 16
displayed a dark blue phenotype when plated on media containing
BCIP. DNA sequence analysis of the inserts from several of
these plasmids was performed (results not shown) and the
deduced amino acid sequences were analyzed. The hydrophobicity
profile of one of the clones (pPHOS7, results not shown) showed
that the N-terminal part of the sequence resembled the
characteristics of a typical signal peptide: a short
hydrophilic N-terminal region is followed by a hydrophobic
region of 38 amino acids. These data indicate that the phoA
system was successfully used for the selection of S. suis
genes encoding exported proteins. Moreover, the sequences were
analyzed for similarities present in the databases. The
sequence of pPHOS7 showed a high similarity (37% identity) with
the protein encoded by the cps14C gene of Streptococcus
pneumoniae (19). This strongly suggests that pPHOS7 contains a
part of the cps operon of S. suis type 2.
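A hydrophobicity profile of the kind used above to recognise a signal peptide can be illustrated with a sliding-window average. The Methods cite Klein et al. (17) for the prediction of hydrophobic stretches; the sketch below does not reproduce that method and instead uses the widely known Kyte-Doolittle hydropathy scale as a simpler stand-in. The window size, the function names and the test peptide are assumptions made for illustration only.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of every window of the given size."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(sequence, window=9):
    """Return (0-based start index, mean hydropathy) of the top window."""
    profile = hydropathy_profile(sequence, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, profile[best]


# Hypothetical peptide: charged start, hydrophobic core, charged end.
toy_peptide = "MKKRS" + "L" * 9 + "DEDEK"
print(most_hydrophobic_window(toy_peptide))
```

A short, positively charged N-terminus followed by a strongly positive stretch in such a profile is the pattern the text describes as typical of a signal peptide.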

Cloning of the flanking cps genes. In order to clone the
flanking cps genes of S. suis type 2 the insert of pPHOS7 was
used as a probe to identify chromosomal DNA fragments which
contain flanking cps genes. A 6-kb HindIII fragment was
identified and cloned in pKUN19. This yielded clone pCPS6 (Fig.
1C). Sequence analysis of the insert of pCPS6 revealed that
pCPS6 most probably contained the 5'-end of the cps locus, but
still lacked the 3'-end. Therefore, sequences of the 3'-end of
pCPS6 were in turn used as a probe to identify chromosomal
fragments containing cps sequences located further downstream.
These fragments were also cloned in pKUN19, resulting in
pCPS17. Using the same system of chromosomal walking we
subsequently generated the plasmids pCPS18, pCPS20, pCPS23 and
pCPS26, containing downstream cps sequences.

Analysis of the cps operon. The complete nucleotide sequence of
the cloned fragments was determined (figure 4). Examination of
the compiled sequence revealed the presence of at least 13
potential open reading frames (Orfs), which were designated as
Orf2Y, Orf2X and Cps2A-Cps2K (Fig. 1A). Moreover, a 14th,
incomplete, Orf (Orf2Z) was located at the 5'-end of the
sequence. Two potential promoter sequences were identified. One
was located 313 bp (locations 1885-1865 and 1884-1889)
upstream of Orf2X. The other potential promoter sequence was
located 68 bp upstream of Orf2Y (locations 2241-2236 and
2216-2211). Orf2Y is expressed in opposite orientation. Between Orfs
2Y and 2Z the sequence contained a potential stem-loop
structure, which could act as a transcription terminator. Each
Orf is preceded by a ribosome-binding site and the majority of
the Orfs are very closely linked. The only significant
intergenic gap was found between Cps2G and Cps2H (389
nucleotides). However, no obvious promoter sequences or
potential stem-loop structures were found in this region. These
data suggest that Orf2X and Cps2A-Cps2K are arranged as an
operon.

An overview of all Orfs with their properties is shown in
Table 2. The majority of the predicted gene products are related
to proteins involved in polysaccharide biosynthesis. Orf2Z
showed some similarity with the YitS protein of Bacillus
subtilis. YitS was identified during the sequence analysis of
the complete genome of B. subtilis. The function of the protein
is unknown.

Orf2Y showed similarity with YcxD protein of B. subtilis
(53). Based on the similarity between YcxD and MocR of
Rhizobium meliloti (33), YcxD was suggested to be a regulatory
protein.

Orf2X showed similarity with the hypothetical YAAA proteins
of Haemophilus influenzae and E. coli. The function of these
proteins is unknown.

The gene products encoded by the cps2A, cps2B, cps2C and
cps2D genes showed approximate similarity with the CpsA, CpsC,
CpsD and CpsB proteins of several serotypes of Streptococcus
pneumoniae (19), respectively. This suggests similar functions
for these proteins. Hence, Cps2A may have a role in the
regulation of the capsular polysaccharide synthesis. Cps2B and
Cps2C could be involved in the chain length determination of
the type 2 capsule and Cps2C can play an additional role in the
export of the polysaccharide. The Cps2D protein of S. suis is
related to the CpsB protein of S. pneumoniae and to proteins
encoded by genes of several other Gram-positive bacteria
involved in polysaccharide or exopolysaccharide synthesis, but
their function is unknown (19).

The protein encoded by the cps2E gene showed similarity to
several bacterial proteins with glycosyl transferase
activities: Cps14E and Cps19fE of S. pneumoniae serotypes 14
and 19F (18, 19, 29), CpsE of Streptococcus salivarius (X94980)
and CpsD of Streptococcus agalactiae (34). Recently, Kolkman et
al. (18) showed that Cps14E is a glucosyl-1-phosphate
transferase that links glucose to a lipid carrier, the first
step in the biosynthesis of the S. pneumoniae type 14 repeating
unit. Based on these data a similar function may be fulfilled
by Cps2E of S. suis.

The protein encoded by the cps2F gene showed similarity to
the protein encoded by the rfbU gene of Salmonella enterica
(25). This similarity is most pronounced in the C-terminal
regions of these proteins. The rfbU gene was shown to encode
mannosyltransferase activity (25).

The cps2G gene encoded a protein that showed moderate
similarity with the rfbF gene product of Campylobacter hyoilei
(22), the epsF gene product of S. thermophilus (40) and the
capM gene product of S. aureus (24). On the basis of
similarity the rfbF, epsF and capM genes are suggested to
encode galactosyltransferase activities. Hence, a similar
glycosyl transferase activity could be fulfilled by the cps2G
gene product.

The cps2H gene encodes a protein that is similar to the
N-terminal region of the lgtD gene product of Haemophilus
influenzae (U32768). Moreover, the hydrophobicity plots of
Cps2H and LgtD looked very similar in these regions (data not
shown). Based on sequence similarity the lgtD gene product was
suggested to have glycosyl transferase activity (U32768).

The gene product encoded by the cps2I gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668). This protein is part of the
gene cluster responsible for the serotype-b-specific antigen of
A. actinomycetemcomitans. The function of the protein is unknown.

The gene products encoded by the cps2J and cps2K genes
showed significant similarities to the Cps14J protein of S.
pneumoniae. The cps14J gene of S. pneumoniae was shown to
encode a β-1,4-galactosyltransferase activity. In S.
pneumoniae Cps14J is responsible for the addition of the fourth
(i.e. last) sugar in the synthesis of the S. pneumoniae
serotype 14 polysaccharide (20). Some similarity was even
found between Cps2J and Cps2K (Fig. 2, 25.5% similarity). This
similarity was most pronounced in the N-terminal regions of the
proteins. Recently, two small conserved regions were identified
in the N-terminus of Cps14J and Cps14I and their homologues
(20). These regions were predicted to be important for
catalytic activity. Both regions, DXS and DXDD (Fig. 2), were
also found in Cps2J and Cps2K.
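Short motifs such as DXS and DXDD, where X stands for any residue, can be located in a protein sequence with a regular expression. The sketch below is illustrative only: the function name and the test sequence are hypothetical, and the real Cps2J and Cps2K sequences are not reproduced here. Because motifs this short also occur by chance, a hit only becomes meaningful together with its position in an alignment such as the one in Fig. 2.

```python
import re

# X in the motif names stands for any amino acid ("." in the pattern).
# A lookahead is used so that overlapping occurrences are all reported.
MOTIF_PATTERNS = {
    "DXS": re.compile(r"(?=(D.S))"),
    "DXDD": re.compile(r"(?=(D.DD))"),
}


def find_motifs(sequence):
    """Return, per motif, a list of (1-based position, matched residues)."""
    sequence = sequence.upper()
    return {
        name: [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]
        for name, pattern in MOTIF_PATTERNS.items()
    }


# Hypothetical test sequence, not taken from the cps2 locus.
toy_sequence = "MKDGSLLNDADDQ"
print(find_motifs(toy_sequence))
```

For the hypothetical sequence above, the function reports DGS at position 3 and DADD at position 9.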
Distribution of the cps2 genes in other S. suis serotypes. To
examine the relationship between the cps2 genes and cps genes
in the other S. suis serotypes, we performed
cross-hybridization experiments. DNA fragments of the individual
cps2 genes were amplified by PCR, labelled with 32P, and used
to probe Southern blots of chromosomal DNA of the reference
strains of the 35 different S. suis serotypes. Large variation
in the hybridization patterns was observed (Table 4). As a
positive control we used a probe specific for 16S rRNA. The
16S rRNA probe hybridized with all serotypes tested. However,
none of the other genes tested were common to all serotypes.
Based on the genetic organization of the genes we previously
suggested that orfX and cpsA-cpsK genes are part of one operon
and that the proteins encoded by these genes are all involved
in polysaccharide biosynthesis. OrfY and OrfZ are not a part
of this operon, and their role in the polysaccharide
biosynthesis is unclear. Based on sequence similarity data,
OrfY may be involved in regulation of the cps2 genes. OrfZ is
proposed to be unrelated to polysaccharide biosynthesis.

Probes specific for the orfZ, orfY, orfX, cpsA, cpsB, cpsC and
cpsD genes hybridized with most other serotypes. This suggests
that the proteins encoded by these genes are not type-specific,
but may perform more common functions in biosynthesis of the
capsular polysaccharide. This confirms previous data which
showed that the cps2A-cps2D genes showed strong similarity to
cps genes of several serotypes of Streptococcus pneumoniae.
Based on this similarity Cps2A is possibly a regulatory
protein, whereas Cps2B and Cps2C may play a role in length
determination and export of polysaccharide. The cps2E gene
hybridized with DNA of serotypes 1, 2, 14 and 1/2. The cps2E
gene showed a strong similarity to the cps14E gene of S.
pneumoniae (18). This enzyme was shown to have a
glucosyl-1-phosphate transferase activity and catalyzed the
transfer of glucose to a lipid carrier (18). These data indicate
that a glycosyltransferase closely related to Cps14E may be
responsible for the first step in the biosynthesis of
polysaccharide in the S. suis serotypes 1, 2, 14 and 1/2. The
cps2F, cps2G, cps2H, cps2I and cps2J genes hybridized with
chromosomal DNA of serotypes 2 and 1/2 only. The cps2G gene
showed an additional weak hybridization signal with DNA of
serotype 34. In agglutination tests serotype 1/2 showed
agglutination with sera specific for serotype 2 as well as
with sera specific for serotype 1. This suggests that serotype
1/2 shares antigenic determinants with both types 1 and 2. The
hybridization data confirmed these data. All putative
glycosyltransferases present in serotype 2 are also present in
serotype 1/2. The cps2K gene showed a hybridization
pattern similar to that of the cps2E gene. Hybridization was
observed with DNA of serotypes 1, 2, 14 and 1/2. Taken together
these hybridization data show that the cps2 gene cluster can be
divided in three regions: a central region containing the
type-specific genes is flanked by two regions containing
common genes for various serotypes.
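The grouping of genes by hybridization pattern described above can be made explicit in a few lines of code. The sketch below encodes only the serotype lists that are stated in the text; the genes said to hybridize with "most other serotypes" are omitted because no explicit list is given, and the weak cps2G signal with serotype 34 is deliberately left out of the strong-signal sets. The data structure and function name are illustrative.

```python
# Serotypes whose DNA gave a clear hybridization signal with each probe,
# as stated in the text. Gene order follows the order in the cps2 locus.
STRONG_HYBRIDIZATION = {
    "cps2E": {"1", "2", "14", "1/2"},
    "cps2F": {"2", "1/2"},
    "cps2G": {"2", "1/2"},  # weak extra signal with serotype 34 not included
    "cps2H": {"2", "1/2"},
    "cps2I": {"2", "1/2"},
    "cps2J": {"2", "1/2"},
    "cps2K": {"1", "2", "14", "1/2"},
}


def group_by_pattern(hybridization):
    """Group genes that share exactly the same set of positive serotypes."""
    groups = {}
    for gene, serotypes in hybridization.items():
        groups.setdefault(frozenset(serotypes), []).append(gene)
    return groups


for pattern, genes in group_by_pattern(STRONG_HYBRIDIZATION).items():
    print(sorted(pattern), "->", genes)
```

Run on these data, the grouping places cps2F to cps2J together as the genes restricted to serotypes 2 and 1/2, with cps2E and cps2K sharing the broader pattern on either side, in line with the regional division described in the text.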

Cloning of the type-specific cps genes of serotypes 1 and 9. 

To clone the type-specific cps genes of S. suis serotype 1 we
used the cps2E gene as a probe to identify chromosomal DNA
fragments of type 1 which contain flanking cps genes. A 5 kb
EcoRV fragment was identified and cloned in pKUN19. This
yielded pCPS1-1 (Fig. 1B). This fragment was in turn used as a
probe to identify an overlapping 2.2 kb HindIII fragment.
pKUN19 containing this HindIII fragment was designated
pCPS1-2. The same strategy was followed to identify and clone
the type-specific cps genes of serotype 9. In this case, we
used the cps2D gene as a probe. A 0.8 kb HindIII-XbaI fragment
was identified and cloned, yielding pCPS9-1 (Fig. 1C). This
fragment was, in turn, used as a probe to identify a 4 kb XbaI
fragment. pKUN19 containing this 4 kb XbaI fragment was
designated pCPS9-2.
Analysis of the cloned cps1 genes. The complete nucleotide
sequence of the inserts of pCPS1-1 and pCPS1-2 was determined
(Fig. 5). Examination of the sequence revealed the presence
of five complete and two incomplete Orfs (Fig. 1B). Each Orf
is preceded by a ribosome-binding site. In accord with data
obtained for the cps2 genes of serotype 2, the majority of the
Orfs are very closely linked. The only significant gap (718 bp)
was found between Cps1G and Cps1H. No obvious promoter
sequences or potential stem-loop structures could be found in
this region. This suggests that, as in serotype 2, the cps
genes in serotype 1 are arranged in an operon.

An overview of the Orfs and their properties is shown in
Table 2. As expected on the basis of the hybridization data
(Table 4), the protein encoded by the cps1E gene was related
to Cps2E of S. suis type 2 (identity of 86%). The fragment
cloned in pCPS1-1 lacked the coding region for the first 7
amino acids of the cps1E gene.

The proteins encoded by the cps1F and cps1G genes showed
strong similarity to the Cps14F and Cps14G proteins of
Streptococcus pneumoniae serotype 14, respectively (20). The
function of Cps14F is not completely clear, but it has
been suggested that Cps14F can enhance glycosyltransferase
activity. The cps14G gene of S. pneumoniae was shown to encode
β-1,4-galactosyltransferase activity. In S. pneumoniae type 14
this activity is required for the second step in the
biosynthesis of the oligosaccharide subunit (20). Based on the
similarity data, similar glycosyltransferase and enhancing
activities are suggested for the cps1G and cps1F genes of
S. suis type 1.
The protein encoded by the cps1H gene showed similarity to
the Cps14H protein of S. pneumoniae (20). Based on sequence
similarity Cps14H was proposed to be the polysaccharide
polymerase (20).

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide.

Between Cps1G and Cps1H a gap of 718 bp was found. This
region revealed three small Orfs. The three Orfs were
present in three different reading frames, were not
preceded by potential ribosome-binding sites and did not
contain potential start sites. However, the three potential
gene products encoded by this region showed some similarity
with three successive regions of the C-terminal part of the
EpsK protein of Streptococcus thermophilus (27% identity, 40).
The region related to the first 82 amino acids is lacking.



Analysis of the cloned cps9 genes. We also determined the
complete nucleotide sequence of the inserts of pCPS9-1 and
pCPS9-2 (Fig. 6). Examination of the sequence revealed the
presence of three complete and two incomplete Orfs (Fig. 1C).
As in serotypes 1 and 2, all Orfs are preceded by a
ribosome-binding site and are very closely coupled. As
suggested by the hybridization data (Table 4) the Cps2D and
Cps9D proteins were highly related (Table 2). Based on sequence
comparisons pCPS9-1 lacked the first 27 amino acids of the
Cps9D protein.

The protein encoded by the cps9E gene showed some
similarity with the CapD protein of Staphylococcus aureus
serotype 1 (24). Based on sequence similarity data the Cap1D
protein was suggested to be an epimerase or a dehydratase
involved in the synthesis of N-acetylfructosamine or
N-acetylgalactosamine (63).

Cps9F showed some similarity to the CapM proteins of S.
aureus serotypes 5 and 8 (61, 64, 65). Based on sequence
similarity data Cap5M and Cap8M are proposed to be
glycosyltransferases (63).

The protein encoded by the cps9G gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668_4). This protein is part of a
gene cluster responsible for the serotype-b specific antigens
of Actinobacillus actinomycetemcomitans. The function of the
protein is unknown.

The protein encoded by the cps9H gene showed some
similarity with the RfbB protein of Yersinia enterocolitica
(68). The RfbB protein was shown to be essential for O-antigen
synthesis, but the function of the protein in the synthesis
of the O:3 lipopolysaccharide is unknown.



Serotype 1 and serotype 9 specific cps genes. To determine
whether the cloned fragments in pCPS1-1, pCPS1-2, pCPS9-1 and
pCPS9-2 contained the type-specific genes for serotype 1 and
9, respectively, cross-hybridization experiments were
performed. DNA fragments of the individual cps1 and cps9 genes
were amplified by PCR, labelled with 32P, and used to probe
Southern blots of chromosomal DNA of the reference strains of
the 35 different S. suis serotypes. The results are shown in
Table 5. Based on the data obtained with the cps2E probe
(Table 4), the cps1E probe was expected to hybridize with
chromosomal DNA of S. suis serotypes 1, 2, 14, 27 and 1/2. The
cps1H, cps9E and cps9F probes hybridized with most other
serotypes. However, the cps1F, cps1G and cps1I probes
hybridized with chromosomal DNA of serotypes 1 and 14 only.
The cps9G and cps9H probes hybridized with serotype 9 only.
These data suggest that the cps9G and cps9H probes are
specific for serotype 9 and therefore could be useful tools
for the development of rapid and sensitive diagnostic tests
for S. suis type 9 infections.

Type specific PCR. So far, the probes were tested on the 35
different reference strains only. To test the diagnostic value
of the type-specific cps probes further, several other S. suis
serotype 1, 2, 1/2, 9 and 14 strains were used. Moreover,
since a PCR based method would be even more rapid and
sensitive than a hybridization test, we tested whether we
could use a PCR for the serotyping of the S. suis strains. The
oligonucleotide primer sets were chosen within the cps2J,
cps1I and cps9H genes. Amplified fragments of 675 bp, 380 bp
and 390 bp were expected, respectively. The results show that
675 bp fragments were amplified on type 2 and 1/2 strains
using cps2J primers; 380 bp fragments were amplified on type 1
and 14 strains using cps1I primers and 390 bp fragments were
amplified on type 9 strains using cps9H primers.
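
The interpretation of such a type-specific PCR can be sketched as a simple lookup. This is a minimal illustration, assuming that each primer set is scored only as positive or negative; the table PCR_ASSAYS transcribes the amplicon sizes and serotypes stated above, while the names PCR_ASSAYS and candidate_serotypes are hypothetical.

```python
# Illustrative sketch only. Primer sets, expected amplicon sizes and the
# serotypes on which a product was obtained are taken from the text.
PCR_ASSAYS = {
    "cps2J": {"amplicon_bp": 675, "serotypes": ("2", "1/2")},
    "cps1I": {"amplicon_bp": 380, "serotypes": ("1", "14")},
    "cps9H": {"amplicon_bp": 390, "serotypes": ("9",)},
}


def candidate_serotypes(positive_primer_sets):
    """Return serotypes consistent with the primer sets that gave a product."""
    candidates = []
    for primer_set in positive_primer_sets:
        for serotype in PCR_ASSAYS[primer_set]["serotypes"]:
            if serotype not in candidates:
                candidates.append(serotype)
    return candidates


if __name__ == "__main__":
    # A strain positive with the cps9H primers only.
    print(candidate_serotypes(["cps9H"]))
```

Note that the cps2J and cps1I primer sets each leave two candidate serotypes, so in this sketch only a cps9H-positive result maps to a single serotype.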

Construction of mutants impaired in capsule production. To
evaluate the role of the capsule of S. suis type 2 in the
pathogenesis, we constructed two isogenic mutants in which
capsule production was disturbed. To construct mutant 10cpsB,
pCPS11 was used. In this plasmid a part of the cps2B gene was
replaced by the spectinomycin-resistance gene. To construct
mutant strain 10cpsEF the plasmid pCPS28 was used. In pCPS28
the 3'-end of the cps2E gene as well as the 5'-end of the cps2F
gene were replaced by the spectinomycin-resistance gene. pCPS11
and pCPS28 were used to electrotransform strain 10 of S. suis
type 2 and spectinomycin-resistant colonies were selected.
Southern blotting and hybridization experiments were used to
select double cross-over integration events (results not shown).
To test whether the capsular structure of the strains 10cpsB
and 10cpsEF was disturbed, we used a slide agglutination test
using a suspension of the mutant strains in hyperimmune anti-S.
suis type 2 serum (44). The results showed that even in the
absence of serotype specific antisera, the bacteria
agglutinated. This indicates that in the mutant strains the
capsular structure was disturbed. To confirm this, thin
sections of wild type and mutant strains were compared by
electron microscopy. The results showed that compared to the
wild type (Fig. 3A) the amount of capsule produced by the
mutant strains was greatly reduced (Figs. 3B and 3C). Almost no
capsular material could be detected on the surface of the
mutant strains.
Capsular mutants are sensitive to phagocytosis and killing by
porcine alveolar macrophages (PAM).

The capsular mutants were tested for their ability to resist
phagocytosis by PAM in the presence of porcine SPF serum. The
wild type strain 10 seemed to be resistant to phagocytosis
under these conditions (Fig. 4A). In contrast, the mutant
strains were efficiently ingested by macrophages (Fig. 4A).
After 90 min, more than 99.7% (strain 10cpsB) and 99.8% (strain
10cpsEF) of the mutant cells were ingested by the macrophages.
Moreover, as shown in Fig. 4B the ingested strains were
efficiently killed by the macrophages: 90-98% of all ingested
cells were killed within 90 min. No differences could be
observed between wild type and mutant strains. These data
indicate that the capsule of S. suis type 2 efficiently
protects the organism from uptake by macrophages in vitro.

Capsular mutants are less virulent for germfree piglets. The
virulence properties of the wild-type and mutant strains were
tested after experimental infection of newborn germfree pigs
(45, 49). Table 1 shows that specific and nonspecific signs of
disease could be observed in all pigs inoculated with the wild
type strain. Moreover, all pigs inoculated with the wild type
strain died during the course of the experiment or were killed
because of serious illness or nervous disorders (Table 3). In
contrast, the pigs inoculated with strains 10cpsB and 10cpsEF
showed no specific signs of disease and all pigs survived until
the end of the experiment. The temperature of the pigs
inoculated with the wild type strain increased 2 days after
inoculation and remained high until day 5 (Table 3). The
temperature of the pigs inoculated with the mutant strains
sometimes exceeded 40°C; however, we could observe
significant differences in the fever index [i.e. % of
observations in an experimental group during which pigs showed
fever (>40°C)] between pigs inoculated with wild type and
mutant strains. All pigs showed increased numbers of
polymorphonuclear leucocytes (PMLs) (>10 x 10^9 PMLs per litre)




(Table 3). However, in pigs inoculated with the mutant strains
the percentage of samples with increased numbers of PMLs was
considerably lower. S. suis strains and B. bronchiseptica could
be isolated from the nasopharynx and feces swab samples of all
pigs from 1 day post-infection until the end of the experiment
(Table 3). Postmortem, the wild type strain could frequently be
isolated from the central nervous system (CNS), kidney, heart,
liver, spleen, serosae, joints and tonsils. Mutant strains
could easily be recovered from the tonsils, but were never
recovered from the kidney, liver or spleen. Interestingly, low
numbers of the mutant strains were isolated from the CNS, the
serosae, the joints, the lungs and the heart. Taken together,
these data strongly indicated that mutant S. suis strains,
impaired in capsule production, are not virulent for young
germfree pigs.
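
The fever index used above is defined as the percentage of observations in an experimental group during which pigs showed fever (>40°C). A minimal sketch of that calculation follows; the function name fever_index and the example temperatures are hypothetical and are not data from the experiment.

```python
# Illustrative sketch only: the threshold follows the definition in the
# text (fever = temperature above 40 degrees C); the example values below
# are invented for demonstration and are not experimental data.
FEVER_THRESHOLD_C = 40.0


def fever_index(temperatures_c):
    """Percentage of observations in a group with temperature above 40 C."""
    if not temperatures_c:
        raise ValueError("at least one observation is required")
    fever_count = sum(1 for t in temperatures_c if t > FEVER_THRESHOLD_C)
    return 100.0 * fever_count / len(temperatures_c)


if __name__ == "__main__":
    hypothetical_group = [39.0, 40.5, 41.0, 39.5]
    print(fever_index(hypothetical_group))
```

Because the index is computed over all observations in a group, occasional readings above 40°C in the mutant groups can still yield a much lower index than in the wild type group.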

We describe the identification and the molecular
characterisation of the cps locus, involved in the capsular
polysaccharide biosynthesis, of S. suis. Most of the genes
seemed to belong to a single transcriptional unit, suggesting a
co-ordinate control of these genes. We assign functions to most
of the gene products. We thereby identified regions involved in
regulation (Cps2A), chain length determination (Cps2B, C),
export (Cps2C) and biosynthesis (Cps2E, F, G, H, J, K). The
region involved in biosynthesis is located at the centre of the
gene cluster and is flanked by two regions containing genes
with more common functions. The incomplete orf2Z gene was
located at the 5'-end of the cloned fragment. Orf2Z showed some
similarity with the YitS protein of B. subtilis. However,
because the function of the YitS protein is unknown this did
not give us any information about the possible function of
Orf2Z. Because the orf2Z gene is not a part of the cps operon,
a role of this gene in polysaccharide biosynthesis is not
expected. The Orf2Y protein showed some similarity with the
YcxD protein of B. subtilis (53). The YcxD protein was suggested
to be a regulatory protein. Similarly, Orf2Y may be involved in
the regulation of polysaccharide biosynthesis. The Orf2X
protein showed similarity with the YaaA proteins of H.
influenzae and E. coli. The function of these proteins is
unknown. In S. suis type 2 the orf2X gene seemed to be the
first gene in the cps2 operon. This suggests a role of Orf2X in
the polysaccharide biosynthesis. In H. influenzae and E. coli,
however, these proteins are not associated with capsular gene
clusters. The analysis of isogenic mutants impaired in the
expression of Orf2X should give more insight into the presumed
role of Orf2X in the polysaccharide biosynthesis of S. suis
type 2.

The gene products encoded by the cps2E, cps2F, cps2G, cps2H,
cps2J and cps2K genes showed little similarity with
glycosyltransferases of several Gram-positive or Gram-negative
bacteria (18, 19, 20, 22, 25). The cps2E gene product shows
some similarity with the Cps14E protein of S. pneumoniae (18,
19). Cps14E is a glucosyl-1-phosphate transferase that links
glucose to a lipid carrier (18). In S. pneumoniae this is the
first step in the biosynthesis of the oligosaccharide repeating
unit. The structure of the S. suis serotype 2 capsule contains
glucose, galactose, rhamnose, N-acetylglucosamine and sialic
acid in a ratio of 3:1:1:1:1 (7). Based on these data we
conclude that Cps2E of S. suis has glucosyltransferase
activity, and is involved in the linkage of the first sugar to
the lipid carrier.

The C-terminal region of the cps2F gene product showed some
similarity with the RfbU protein of Salmonella enterica. RfbU
was shown to have mannosyltransferase activity (24). Because
mannose is not a component of the S. suis type 2
polysaccharide a mannosyltransferase activity is not expected
in this organism. Nevertheless, cps2F encodes a
glycosyltransferase with another sugar specificity.

Cps2G showed moderate similarity to a family of gene
products suggested to encode galactosyltransferase activities
(22, 24, 40). Hence a similar activity is shown for Cps2G.

Cps2H showed some similarity with LgtD of H. influenzae
(U32768). Because LgtD was proposed to have glycosyltransferase
activity, a similar activity is fulfilled by Cps2H.

Cps2J and Cps2K showed similarity to Cps14J of S. pneumoniae
(20). Cps2J showed similarity with Cps14I of S. pneumoniae as
well. Cps14I was shown to have N-acetylglucosaminyltransferase
activity, whereas Cps14J has a β-1,4-galactosyltransferase
activity (20). In S. pneumoniae Cps14I is responsible for the
addition of the third sugar and Cps14J for the addition of the
last sugar in the synthesis of the type 14 repeating unit
(20). Because the capsule of S. suis type 2 contains galactose
as well as N-acetylglucosamine components,
galactosyltransferase as well as
N-acetylglucosaminyltransferase activities could be envisaged
for the cps2J and cps2K gene products, respectively. As was
observed for Cps14I and Cps14J, the N-termini of Cps2J and Cps2K
showed a significant degree of sequence similarity. Within the
N-terminal domains of Cps14I and Cps14J, two small regions were
identified, which were also conserved in several other
glycosyltransferases (22). Within these two regions, two Asp
residues were proposed to be important for catalytic activity.
The two conserved regions, DXS and DXDD, were also found in
Cps2J and Cps2K.
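
A search for the two conserved regions can be sketched with regular expressions, reading X as any amino acid in one-letter code. This is an illustration only: the function name find_motifs and the short example sequence are hypothetical, and the sketch does not reproduce the positional comparison with Cps14I and Cps14J made above.

```python
import re

# Illustrative sketch only. "X" in DXS and DXDD is read as any residue.
MOTIFS = {
    "DXS": "D[A-Z]S",
    "DXDD": "D[A-Z]DD",
}


def find_motifs(protein_sequence):
    """Return {motif name: [0-based start positions]} for a protein sequence."""
    sequence = protein_sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are reported.
        hits[name] = [m.start() for m in re.finditer("(?=" + pattern + ")", sequence)]
    return hits


if __name__ == "__main__":
    # Hypothetical peptide, not a real Cps sequence.
    print(find_motifs("MKDASLLDGDDK"))
```

Such short patterns occur frequently by chance, so a hit is only meaningful when it lies at a position corresponding to the conserved regions of related glycosyltransferases.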

The function of Cps2I remains unclear. Cps2I showed some
similarity with a protein of A. actinomycetemcomitans. Although
this protein is part of the gene cluster responsible for the
serotype b-specific antigens, the function of the protein is
unknown.

We further describe the identification and characterization
of the cps genes specific for S. suis serotypes 1, 2 and 9.
After the entire cps2 locus of S. suis serotype 2 was cloned
and characterized, functions for most of the cps2 gene
products could be assigned by sequence homologies. Based on
these data the glycosyltransferase activities, required for
type specificity, could be located in the centre of the
operon. Cross-hybridization experiments, using the individual
cps2 genes as probes on chromosomal DNAs of the 35 different
serotypes, confirmed this idea. The regions containing the
type-specific genes of serotypes 1 and 9 could be cloned and
characterized, showing that the cps operons of other S. suis
serotypes have an identical genetic organization. The
cps1E, cps1F, cps1G, cps1H and cps1I genes revealed a
striking similarity with the cps14E, cps14F, cps14G, cps14H and
cps14J genes of S. pneumoniae. Interestingly, S. pneumoniae
serotype 14 is the serotype most commonly associated with
pneumococcal infections in young children (54), whereas S.
suis serotype 1 strains are most commonly isolated from
piglets younger than 8 weeks (46). In S. pneumoniae the
cps14E, cps14G, cps14I and cps14J genes encode the
glycosyltransferases required for the synthesis of the type 14
tetrameric repeating unit, showing that the cps1E, cps1G and
cps1I genes encoded glycosyltransferases. The precise
functions of these genes as well as the substrate
specificities of the enzymes can be established. In S.
pneumoniae the cps14E gene was shown to encode a
glucosyl-1-phosphate transferase catalyzing the transfer of
glucose to a lipid carrier. Moreover, cpsE-like genes were
found in S. pneumoniae serotypes 9N, 13, 14, 15B, 15C, 18F, 18A
and 19F (60). CpsE mutants were constructed in the serotypes
9N, 13, 14 and 15B. All mutant strains lacked
glucosyltransferase activity (60). Moreover, in all these S.
pneumoniae serotypes the cpsE gene seemed to be responsible for
the addition of glucose to the lipid carrier. Based on these
data we suggest that in S. suis type 1 the cps1E gene may
fulfil a similar function. The structure of the S. suis type 1
capsule is unknown, but it is composed of glucose, galactose,
N-acetylglucosamine, N-acetylgalactosamine and sialic acid in
a ratio of 1:2.4:1:1:1.4 (5). Therefore a role of a cpsE-like
glucosyltransferase activity can easily be envisaged.
CpsE-like sequences were also found in serotypes 2, 1/2 and 14.

For polysaccharide biosynthesis in S. pneumoniae type 14,
transfer of the second sugar of the repeating unit to the
first lipid-linked sugar is performed by the gene products of
cps14F and cps14G (20). Similar to Cps14F and Cps14G, the S.
suis type 1 proteins Cps1F and Cps1G may act as one
glycosyltransferase performing the same reaction. Cps14F and
Cps14G of S. pneumoniae showed similarity to the N-terminal
half and C-terminal half of the SpsK protein of Sphingomonas
(20, 67), respectively. This suggests a combined function for
both proteins. Moreover, cps14F and cps14G like sequences were
found in several serotypes of S. pneumoniae and these genes
always seemed to exist together (60). The same was observed
for S. suis type 1. The cps1F and cps1G probes hybridized
with type 1 and type 14 strains.

According to the similarity found between the cps1H gene and
the cps14H gene of S. pneumoniae (20), cps1H is expected to
encode a polysaccharide polymerase.

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide. In S. suis type 2 the proteins encoded by the
cps2J and cps2K genes showed similarity to the Cps14J protein.
However, no significant homologies were found between Cps2J,
Cps2K and Cps1I. In the N-terminal regions of Cps14J and
Cps14I two small conserved regions, DXS and DXDD, were
identified (19). These regions seemed to be important for
catalytic activity (13). At the same positions in the sequence
Cps2I contained the regions DXS and DXED.

In the region between Cps1G and Cps1H three small Orfs were
identified. Since the Orfs were present in three different
reading frames, and did not contain potential start sites,
expression is not expected. However, the three potential gene
products encoded by this region showed some similarity with
three successive regions of the C-terminal part of the EpsK
protein of Streptococcus thermophilus (27% identity, 40). The
region related to the first 82 amino acids is lacking. The
EpsK protein was suggested to play a role in the export of the
exopolysaccharide by rendering the polymerized
exopolysaccharide more hydrophobic through a lipid
modification. These data could suggest that the sequences in
the region between Cps1G and Cps1H originated from an epsK-like
sequence. Hybridization experiments showed that this epsK-like
region is also present in other serotype 1 strains as well as
in serotype 14 strains (results not shown).

The function of most of the cloned serotype 9 genes can be
established. Based on sequence similarity data the cps9E and
cps9F genes could be glycosyltransferases (61, 24, 63, 64,
65). Moreover, the cps9G and cps9H genes showed similarity to
genes located in regions involved in polysaccharide
biosynthesis, but the function of these genes is unknown (68).

Cross-hybridization experiments using the individual cps2,
cps1 and cps9 genes as probes showed that the cps9G and cps9H
probes specifically hybridized with serotype 9 strains.
Therefore, these are useful as tools for the identification of
S. suis type 9 strains, both for diagnostic purposes and
in epidemiological and transmission studies. We previously
developed a PCR method which can be used to detect S. suis
strains in nasal and tonsil swabs of pigs (62). The method was
for example used to identify pathogenic (EF-positive) strains
of S. suis serotype 2. In recent years, besides S. suis
type 2 strains, serotype 9 strains have frequently been
isolated from organs of diseased pigs. However, until now a
rapid and sensitive diagnostic test was not available for type
9 strains. Therefore, the type 9 specific probes or the type 9
specific PCR is of great diagnostic value. The cps1F, cps1G
and cps1I probes hybridized with serotype 1 as well as with
serotype 14 strains. In coagglutination tests type 1 strains
react with the anti-type 1 as well as with the anti-type 14
antisera (56). This suggests the presence of common epitopes
between these serotypes. On the other hand type 1 strains
agglutinated only with anti-type 1 serum (56, 57), indicating
that it is possible to detect differences between those
serotypes.

The cps2F, cps2G, cps2H, cps2I and cps2J probes hybridized
with serotypes 2 and 1/2 only. Serotype 34 showed a weak
hybridizing signal with the cps2G probe. As shown in
agglutination tests type 1/2 strains react with sera directed
against type 1 as well as with sera directed against type 2
strains (46). Therefore, type 1/2 shared antigens with both
types 1 and 2. Based on the hybridization patterns of serotype
1/2 strains with the cps1 and cps2 specific genes, serotype
1/2 seemed to be more closely related to type 2 strains than
to type 1 strains. In our current studies we identify
type-specific genes, primers or probes which are used for the
discrimination of serotypes 1, 14 and 2 and 1/2 and others of
the 35 serotypes yet known. Furthermore, type-specific genes,
primers or probes can now easily be developed for yet unknown
serotypes, once they become isolated.

Cloning and characterization of a further part of the cps2 locus.

Based on the established sequence 11 genes, designated
cps2L to cps2T, orf2U and orf2V, were identified. A gene
homologous to genes involved in the polymerization of the
repeating oligosaccharide unit (cps2O) as well as genes
involved in the synthesis of sialic acid (cps2P to cps2T) were
identified. Moreover, hybridization experiments showed that
the genes involved in the sialic acid synthesis are present in
S. suis serotypes 1, 2, 14, 27 and 1/2. The "cps2M" and "cps2N"
regions showed similarity to proteins involved in the
polysaccharide biosynthesis of other Gram-positive bacteria.
However, these regions seemed to be truncated or were
non-functional as the result of frame-shift or point mutations.
At its 3'-end the cps2 locus contained two insertional elements
("orf2U" and "orf2V"), both of which seemed to be
non-functional.

To clone the remaining part of the cps2 locus, sequences
of the 3'-end of pCPS26 (Fig. 1C) were used to identify a
chromosomal fragment containing cps2 sequences located further
downstream. This fragment was cloned in pKUN19 resulting in
pCPS29. Using a similar approach we subsequently isolated the
plasmids pCPS30 and pCPS34 containing downstream cps2
sequences (Fig. 1C).

Analysis of the cps2 operon.
The complete nucleotide sequence of the cloned fragments
was determined. Examination of the compiled sequence revealed
the presence of: a sequence encoding the C-terminal part of
Cps2K, six apparently functional genes (designated
cps2O-cps2T) and the remnants of 5 different ancestral genes
(designated "cps2L", "cps2M", "cps2N", "orf2U" and "orf2V").
The latter genes seemed to be truncated or incomplete as the
result of the presence of stop codons or frame-shift mutations
(Fig. 1A). Neither potential promoter sequences nor potential
stem-loop structures could be identified within the sequenced
region. A ribosome-binding site precedes each ORF and the
majority of the ORFs are very closely linked. Three intergenic
gaps were found: one between "cps2M" and "cps2N" (176
nucleotides), one between cps2O and cps2P (525 nucleotides),
and one between cps2T and "orf2U" (200 nucleotides). These and
our above data show that Orf2X and Cps2A-Orf2T are part of a
single operon.

A list of all loci and their properties is shown in Table
4. The "cps2L" region contained three potential ORFs, of 103,
79 and 152 amino acids, respectively, which were only
separated from each other by stop codons. Only the first ORF
is preceded by a potential ribosome-binding site and
contained a methionine start codon. This suggests that "cps2L"
originates from an ancestral cps2L gene, which coded for a
protein of 339 amino acids. The function of this hypothetical
Cps2L protein remains unclear so far: no significant
homologies were found between Cps2L and proteins present in
the data libraries. It is not clear whether the first ORF of
the "cps2L" region is expressed into a protein of 103 amino
acids. The "cps2M" region showed homology to the N-terminal
134 amino acids of the NeuA proteins of Streptococcus
agalactiae and Escherichia coli (AB017355, 32). However,
although the "cps2M" region contained a potential
ribosome-binding site, a methionine start codon was absent.
Compared with the S. agalactiae sequence, the ATG start codon
was replaced by a lysine-encoding AAG codon. Moreover, the
region homologous to the first 58 amino acids of the S.
agalactiae NeuA (identity 77%) was separated from the region
homologous to amino acids 59-134 of NeuA by a repeated DNA
sequence of 100 bp (see below). In addition, the region
homologous to amino acids 59 to 95 of NeuA (identity 32%) and
the region homologous to amino acids 96 to 134 of NeuA
(identity 50%) were present in different reading frames.
Therefore, the partial and truncated NeuA homologue is probably
nonfunctional in S. suis. The "cps2N" region showed homology to
CpsJ of S. agalactiae (accession no. AB017355). However,
sequences homologous to the first 88 amino acids of CpsJ were
lacking in S. suis. Moreover, the homologous region was present
in two different reading frames. The protein encoded by the
cps2O gene showed homology to proteins of several streptococci
involved in the transport of the oligosaccharide repeating
unit (accession no. AB017355), suggesting a similar function
for Cps2O. The proteins encoded by the cps2P, cps2S and cps2T
genes showed homology to the NeuB, NeuD and NeuA proteins of
S. agalactiae and E. coli (accession no. AB017355). Because the
"cps2M" region also showed homology to NeuA of E. coli, the
S. suis cps2 locus contains a functional neuA gene (cps2T) as
well as a nonfunctional ("cps2M") gene. The mutual homology
between these two regions showed an identity of 77% at the
amino acid level over amino acids 1-58 and 49% over amino
acids 59-134. Cps2Q and Cps2R showed homology to the
N-terminal and C-terminal parts of the NeuC protein of S.
agalactiae and E. coli, respectively. This suggests that the
function of the S. agalactiae NeuC protein in S. suis is
likely fulfilled by two different proteins. In E. coli the
neu genes are known to be involved in the synthesis of sialic
acid. NeuNAc is synthesized from N-acetylmannosamine and
phosphoenolpyruvate by NeuNAc synthetase. Subsequently, NeuNAc
is converted to CMP-NeuNAc by the enzyme CMP-NeuNAc
synthetase. CMP-NeuNAc is the substrate for the synthesis of
polysaccharide. In E. coli K1 NeuB is the NeuNAc synthetase and
NeuA is the CMP-NeuNAc synthetase. NeuC has been implicated in
the NeuNAc synthesis, but its precise role is not known. The
precise role of NeuD is not known. A role of the Cps2P-Cps2T
proteins in the synthesis of sialic acid can easily be
envisaged, since the capsule of S. suis serotype 2 is rich in
sialic acid. In S. agalactiae sialic acid has been shown to be
critical to the virulence function of the type III capsule.
Moreover, it has been suggested that the presence of sialic
acid in the capsule of bacteria which can cause meningitis may
be important for the capacity of these bacteria to breach the
blood-brain barrier. So far, however, the requirement of
sialic acid for virulence of S. suis remains unclear.

"Orf2U" and "Orf2V n showed homology to proteins located on 
two different insertional elements. "Orf2U" is homologous to 
IS1194 of Streptococcus thermophilus, whereas "0rf2V" showed 
homology to a putative transposase of Streptococcus 

20 pneumoniae. This putative transposase was recently found to be 
associated with the type 2 capsular locus of S. pneumoniae. 
Compared with the original insertional elements in S. 
thermophilus and S. pneumoniae, both "Orf2U" and "Orf2V" are 
likely to be non-functional due to frame shift mutations 

25 within their coding regions. 

A striking observation was the presence of a sequence of
100 bp (Fig. 9) which was repeated three times within the cps2
operon. The sequence is highly conserved (between 94% and 98%)
and was found in the intergenic region between cps2G and
cps2H, within "cps2M" and between cps2O and cps2P. No
significant homologies were found between this 100-bp direct
repeat sequence and sequences present in the data libraries,
suggesting that the sequence is unique for S. suis.
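
The conservation quoted for the three repeat copies is a pairwise
percent identity. A minimal sketch of how such a value can be
computed from two pre-aligned copies is given below. The two short
sequences are hypothetical stand-ins, not the actual 100-bp
repeats, and skipping gap columns is an assumption made here.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns in which either sequence carries a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 20-nt stand-ins for two repeat copies (one mismatch).
copy_1 = "ATGCTAGGAGAAAGAATCCC"
copy_2 = "ATGCTAGGAGAAAGTATCCC"
print(f"{percent_identity(copy_1, copy_2):.1f}% identity")
```

Applied to each pair of the three real copies, the same function
would give the three pairwise values from which a range such as
94% to 98% is read.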

Distribution of the cps2 sequences among the 35 S. suis
serotypes. To examine the presence of genes encoding sialic
acid synthesis in other S. suis serotypes, we performed
cross-hybridization experiments. DNA fragments of the individual
cps2 genes were amplified by PCR, radiolabelled with 32P and
hybridized to chromosomal DNA of the reference strains of the
35 different S. suis serotypes. As a positive control we used
a probe specific for S. suis 16S rRNA. The 16S rRNA probe
hybridized with almost equal intensities to all serotypes
tested (Table 4). The "cps2L" sequence hybridized with DNA of
serotypes 1, 2, 14 and 1/2. The "cps2M", cps2O, cps2P, cps2Q,
cps2R, cps2S and cps2T genes hybridized with DNA of serotypes
1, 2, 14, 27 and 1/2. Because the cps2P-cps2T genes are most
probably involved in the synthesis of sialic acid, these
results suggest that sialic acid is also a part of the capsule
in S. suis serotypes 1, 2, 14, 27 and 1/2. This is in
agreement with the finding that serotypes 1, 2 and 1/2
possess a capsule that is rich in sialic acid. Although the
chemical compositions of the capsules of serotypes 14 and 27
are unknown, recent agglutination studies using sialic
acid-binding lectins suggested the presence of sialic acid in
S. suis serotype 14, but not in serotype 27. In these studies,
sialic acid was also detected in serotypes 15 and 16. Since
the latter observation is not in agreement with our
hybridization studies, it might be that other genes, not
homologous to the cps2P-cps2T genes, are responsible for
sialic acid synthesis in serotypes 15 and 16.
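
The comparison between the hybridization results and the reported
presence of sialic acid comes down to two set differences. The
sketch below restates it in Python using only the serotype lists
given in the text; the variable names are illustrative, and
pooling the capsule composition data with the lectin data into one
set is an assumption made for the example.

```python
# Serotypes whose chromosomal DNA hybridized with each probe group,
# as listed in the text.
hybridized = {
    "cps2L": {"1", "2", "14", "1/2"},
    "cps2M-cps2T": {"1", "2", "14", "27", "1/2"},
}

# Serotypes for which sialic acid was reported: capsule composition
# (1, 2, 1/2) pooled with the lectin agglutination studies (14, 15, 16).
sialic_acid_reported = {"1", "2", "1/2", "14", "15", "16"}

neu_positive = hybridized["cps2M-cps2T"]

# Genes detected, but no sialic acid reported.
genes_without_sialic_acid = neu_positive - sialic_acid_reported
# Sialic acid reported, but no hybridization with the cps2 probes.
sialic_acid_without_genes = sialic_acid_reported - neu_positive

print("genes only:", sorted(genes_without_sialic_acid))
print("sialic acid only:", sorted(sialic_acid_without_genes))
```

The first difference isolates serotype 27 and the second isolates
serotypes 15 and 16, which are the two discrepancies discussed in
the paragraph above.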

A probe based on "cps2N" sequences hybridized with DNA from
serotypes 1, 2, 14 and 1/2. A probe specific for "orf2U"
hybridized with serotypes 1, 2, 7, 14, 24, 27, 32, 34, and
1/2, whereas a probe specific for "orf2V" hybridized with many
different serotypes. In addition, we prepared a probe specific
for the 100-bp direct repeat sequence. This probe hybridized
with serotypes 1, 2, 13, 14, 22, 24, 27, 29, 32, 34 and
1/2 (Table 4). To determine the number of copies of the direct
repeat sequence within the S. suis serotype 2 chromosome, a
Southern blot hybridization was performed. To this end,
chromosomal DNA of S. suis serotype 2 was digested
with NcoI and hybridized with a 32P-labelled direct repeat
sequence. Only one hybridizing fragment, containing the three
direct repeats present on the cps2 locus, was found (results
not shown). This indicates that the 100-bp direct repeat
sequence is only associated with the cps2 locus. In
S. pneumoniae a 115-bp long repeated sequence was found to be
associated with the capsular genes of serotypes 1, 3, 14 and
19F. In S. pneumoniae this 115-bp sequence was also found in
the vicinity of other genes involved in pneumococcal virulence
(hyaluronidase and neuraminidase genes). A regulatory role of
the 115-bp sequence in co-ordinate control of these
virulence-related genes was suggested.

To study the role of the capsule in resistance to
phagocytosis and in virulence, we constructed two isogenic
mutants in which capsule synthesis was disturbed. In 10cpsB,
the cps2B gene was disrupted by the insertion of an
antibiotic-resistance gene, whereas in 10cpsEF parts of the
cps2E and cps2F genes were replaced. Both mutant strains
seemed to be completely unencapsulated. Because the cps2
genes seem to be part of an operon, polar effects cannot be
excluded. Therefore these data do not give any information
about the role of Cps2B, Cps2E or Cps2F in polysaccharide
biosynthesis. However, the results clearly show that the
capsular polysaccharide of S. suis type 2 is a surface
component with antiphagocytic activity. In vitro, wild type
encapsulated bacteria are ingested by phagocytes at a very low
frequency, whereas the mutant unencapsulated bacteria are
efficiently ingested by porcine macrophages. Within 2 hours,
over 99.6% of mutant bacteria were ingested and over 92% of
the ingested bacteria were killed. Intracellularly, wild type
as well as mutant strains seemed to be killed with the same
efficiency. This suggests that the loss of capsular material
is associated with loss of capacity to resist uptake by
macrophages. This loss of resistance to in vitro phagocytosis
was associated with a substantial attenuation of the virulence
in germfree pigs. All pigs inoculated with the mutant strains
survived the experiment and did not show any specific clinical
signs of disease. Only some nonspecific clinical signs of
disease could be observed. Moreover, mutant bacteria could be
reisolated from the pigs. This supports the idea that, as in
other pathogenic streptococci, the capsule of S. suis acts as
an important virulence factor. Transposon mutants prepared by
Charland that were impaired in capsule production showed a
reduced virulence in pigs and mice. To construct these mutants
the type 2 reference strain S735 was used. We previously showed
that this strain is only weakly virulent for young pigs.
Moreover, the insertion site of the transposon has not been
determined so far.
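
The two phagocytosis percentages quoted above can be combined into
an overall figure for the unencapsulated mutants. The short
calculation below is only a worked illustration: it treats the
reported lower bounds (over 99.6% ingested, over 92% of the
ingested bacteria killed) as point values, which is an assumption.

```python
# Reported lower bounds for the mutant strains within 2 hours.
ingested_fraction = 0.996      # fraction of mutant bacteria ingested
killed_of_ingested = 0.92      # fraction of ingested bacteria killed

# Fraction of the starting mutant population that was both ingested
# and killed, and the fraction that was not killed.
killed_overall = ingested_fraction * killed_of_ingested
not_killed_overall = 1.0 - killed_overall

print(f"killed: {killed_overall:.1%}, not killed: {not_killed_overall:.1%}")
```

Under these assumptions roughly nine out of ten mutant bacteria
are eliminated by the macrophages within the 2-hour period.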

As a further example herein a rapid PCR test for Streptococcus
suis type 7 is described.


Recent epidemiological studies on Streptococcus suis
infections in pigs indicated that, besides serotypes 1, 2 and
9, serotype 7 is also frequently associated with diseased
animals. For the latter serotype, however, no rapid and
sensitive diagnostic methods are available. This hampers
prevention and control programs. Here we describe the
development of a type-specific PCR test for the rapid and
sensitive detection of S. suis serotype 7. The test is based
on DNA sequences of capsular (cps) genes specific for serotype
7. These sequences could be identified by cross-hybridization
of several individual cps genes with the chromosomal DNAs of
35 different S. suis serotypes.

Streptococcus suis is an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs [69,70].
It can, however, also cause meningitis in man [71]. Attempts
to control the disease are still hampered by the lack of
sufficient knowledge about the epidemiology of the disease and
the lack of effective vaccines and sensitive diagnostics.

S. suis strains can be identified and classified by their
morphological, biochemical and serological characteristics
[70, 73, 74]. Serological classification is based on the
presence of specific antigenic determinants. Isolated and
biochemically characterized S. suis cells are agglutinated
with a panel of specific sera. These typing methods are very
laborious and time-consuming and can only be performed on
isolated colonies. Moreover, it has been reported that
nonspecific cross-reactions may occur among different types of
S. suis [75, 76].

So far, 35 different serotypes have been described [7, 78,
79]. S. suis serotype 2 is the most prevalent type isolated
from diseased pigs, followed by serotypes 9 and 1. However,
recently serotype 7 strains were also frequently isolated from
diseased pigs [80, 81, 82]. This suggests that infections
with S. suis serotype 7 strains are an increasing
problem. Moreover, the virulence of S. suis serotype 7 strains
was confirmed by experimental infection of young pigs [83].
Recently, rapid and sensitive PCR assays specific for
serotypes 2 (and 1/2), 1 (and 14) and 9 were developed [84].
These assays were based on the cps loci of S. suis serotypes 2,
1 and 9 [84, 85]. However, until now no rapid and sensitive
diagnostic test has been available for S. suis serotype 7.
Herein we describe the development of a PCR test for the rapid
and sensitive detection of S. suis serotype 7 strains. The test
is based on DNA sequences which form a part of the cps locus of
S. suis serotype 7. Compared with the serological serotyping
methods, the PCR assay was rapid, reliable and sensitive.
Therefore, this test, in combination with the PCR tests
which we previously developed for serotypes 1, 2 and 9, will
contribute to a more rapid and reliable diagnosis
of S. suis and may facilitate control and eradication
programs.

Materials and Methods



Bacterial strains, growth conditions and serotyping.

The bacterial strains and plasmids used in this study are
listed in Table 7. The S. suis reference strains were obtained
from M. Gottschalk, Canada. S. suis strains were grown in
Todd-Hewitt broth (code CM189, Oxoid), and plated on Columbia
agar blood base (code CM331, Oxoid) containing 6% (v/v) horse
blood. E. coli strains were grown in Luria broth [86] and
plated on Luria broth containing 1.5% (w/v) agar. If required,
ampicillin was added to the plates. The S. suis strains were
serotyped by the slide agglutination test with
serotype-specific antibodies [70].



DNA techniques.

Routine DNA manipulations and PCR reactions were performed
as described by Sambrook et al. [88]. Blotting and
hybridization were performed as described previously [84,86].

DNA sequence analysis.

DNA sequences were determined on a 373A DNA Sequencing
System (Applied Biosystems, Warrington, GB). Samples were
prepared by use of an ABI/PRISM dye terminator cycle sequencing
ready reaction kit (Applied Biosystems). Custom-made
sequencing primers were purchased from Life Technologies.
Sequencing data were assembled and analyzed using the
McMollyTetra program. The BLAST program was used to search for
protein sequences homologous to the deduced amino acid
sequences.


PCR.

The primers used for the cps7H PCR correspond to the
positions 3334-3354 and 3585-3565 in the S. suis cps7 locus.
The sequences were:
5'-AGCTCTAACACGAAATAAGGC-3' and 5'-GTCAAACACCCTGGATAGCCG-3'.

The reaction mixtures contained 10 mM Tris-HCl, pH 8.3; 1.5 mM
MgCl2; 50 mM KCl; 0.2 mM of each of the four deoxynucleotide
triphosphates; 1 microM of each of the primers and 1 U of
AmpliTaq Gold DNA polymerase (Perkin Elmer Applied Biosystems,
New Jersey). DNA amplification was carried out in a Perkin
Elmer 9600 thermal cycler and the program consisted of an
incubation for 10 min at 95°C and 30 cycles of 1 min at 95°C,
2 min at 56°C and 2 min at 72°C.
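
The primer pair and cycling program above lend themselves to a
quick consistency check. The Python sketch below computes primer
length, GC content, a rough melting temperature, the span between
the stated end positions and the total programmed hold time. The
function names are illustrative. The Wallace rule (2 °C per A/T,
4 °C per G/C) is an assumption introduced here as a coarse
estimate for short oligonucleotides; it is not a method described
in the source.

```python
FORWARD = "AGCTCTAACACGAAATAAGGC"  # positions 3334-3354 of the cps7 locus
REVERSE = "GTCAAACACCCTGGATAGCCG"  # positions 3585-3565 (opposite strand)


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule)."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def span_length(first: int, last: int) -> int:
    """Number of positions from first to last, counting both ends."""
    return last - first + 1


# Hold times in minutes: one initial incubation plus 30 three-step cycles.
program_minutes = 10 + 30 * (1 + 2 + 2)

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt,",
          f"GC {gc_count(primer) / len(primer):.0%},",
          f"Tm about {wallace_tm(primer)} C")
print("top-strand sequence at the reverse primer site:",
      reverse_complement(REVERSE))
print("span 3334-3585:", span_length(3334, 3585), "nt")
print("programmed hold time:", program_minutes, "min")
```

Both primers are 21 nt long, matching the 21 positions each one
spans. Counting both end positions, 3334 to 3585 covers 252 nt,
which is close to the 251-bp fragment expected for this primer
set; the one-base difference may reflect how the end positions
were counted.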

Results and discussion 

Cloning of the serotype 7-specific cps genes.

To isolate the type-specific cps genes of S. suis serotype
7 we used the cps9E gene of serotype 9 as a probe to identify
chromosomal DNA fragments of type 7 containing homologous DNA
sequences [84]. A 1.6-kb PstI fragment was identified and
cloned in pKUN19. This yielded pCPS7-1 (Fig. 11C). In turn,
this fragment was used as a probe to identify an overlapping
2.7-kb ScaI-ClaI fragment. pGEM7 containing the latter
fragment was designated pCPS7-2 (Fig. 11C).

Analysis of the cloned cps7 genes.

The complete nucleotide sequences of the inserts of pCPS7-1
and pCPS7-2 were determined. Examination of the cps7 sequence
revealed the presence of two complete and two incomplete open
reading frames (ORFs) (Fig. 11C). All ORFs are preceded by a
ribosome-binding site. In accord with the data obtained for
the cps1, cps2 and cps9 genes of serotypes 1, 2 and 9,
respectively, the type 7 ORFs are very closely linked to each
other. The only significant intergenic gap was that found
between cps7E and cps7F (443 nucleotides). No obvious promoter
sequences or potential stem-loop structures were found in this
region. This suggests that, as in serotypes 1, 2 and 9, the cps
genes in serotype 7 form part of an operon.

An overview of the ORFs and their properties is shown in
Table 8. As expected on the basis of the hybridization data
[84], the Cps9E and Cps7E proteins showed a high similarity
(identity 99%, Table 8). Based on sequence comparisons between
Cps9E and Cps7E, the PstI fragment of pCPS7-1 lacks the region
encoding the first 371 codons of Cps7E. The C-terminal part of
the protein encoded by the cps7F gene showed some similarity
with the BplG protein of Bordetella pertussis [88], as well
as with the C-terminal part of S. suis Cps2E [85]. Both BplG
and Cps2E were suggested to have glycosyltransferase activity
and are probably involved in the linkage of the first sugar to
the lipid carrier [85,88]. The protein encoded by the cps7G
gene showed similarity with the BplF protein of Bordetella
pertussis [88]. BplF is likely to be involved in the
biosynthesis of an amino sugar, suggesting a similar function
for Cps7G. The protein encoded by the cps7H gene showed
similarity with the WbdN protein of E. coli [89] as well as
with the N-terminal part of the Cps2K protein of S. suis [81].
Both WbdN and Cps2K were suggested to have glycosyltransferase
activity [85, 89].

Serotype 7 specific cps genes.

To determine whether the cloned fragments in pCPS7-1 and
pCPS7-2 contained serotype 7-specific DNA sequences,
cross-hybridization experiments were performed. DNA fragments of
the individual cps7 genes were amplified by PCR, labelled with
32P, and used to probe spot blots of chromosomal DNA of the
reference strains of 35 different S. suis serotypes. The
results are summarized in Table 9. As expected, based on the
data obtained with the cps9E probe [84], the cps7E probe
hybridized with chromosomal DNA of many different S. suis
serotypes. The cps7F and cps7G probes showed hybridization
with chromosomal DNA of S. suis serotypes 4, 5, 7, 17, and 23.
However, the cps7H probe hybridized with chromosomal DNA of
serotype 7 only, indicating that this gene is specific for
serotype 7.
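
Selecting a type-specific probe from such a cross-hybridization
table amounts to keeping the probes whose hits consist of the
target serotype alone. A minimal sketch of that filter follows,
using only the serotype lists stated above. The cps7E probe is
left out because the text gives no explicit list for it, and the
function name is illustrative.

```python
# Serotypes whose chromosomal DNA hybridized with each probe (from the text).
probe_hits = {
    "cps7F": {"4", "5", "7", "17", "23"},
    "cps7G": {"4", "5", "7", "17", "23"},
    "cps7H": {"7"},
}


def specific_probes(hits: dict, target: str) -> list:
    """Probes that hybridized with the target serotype and no other."""
    return sorted(probe for probe, serotypes in hits.items()
                  if serotypes == {target})


print(specific_probes(probe_hits, "7"))
```

With these inputs only cps7H passes the filter, in line with the
conclusion drawn in the text.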



Type specific PCR.

We tested whether we could use PCR instead of hybridization
for the typing of S. suis serotype 7 strains. For that
purpose we selected an oligonucleotide primer set within the
cps7H gene with which an amplified fragment of 251 bp was
expected. In addition, we included in our analysis several
S. suis serotype 7 strains other than the reference strain.
These strains were obtained from different countries and were
isolated from different organs (Table 7). The results show
that indeed a fragment of about 250 bp was amplified with all
type 7 strains used (Fig. 12B), whereas no PCR products were
obtained with serotype 1, 2 and 9 strains (Fig. 12A). This
suggests that the PCR test, as described here, is a rapid
diagnostic tool for the identification of S. suis serotype 7
strains. Until now such a diagnostic test was not available
for serotype 7 strains. Together with the recently developed
PCR assays for serotypes 1, 2, 1/2, 14 and 9, this assay may be
an important diagnostic tool to detect pigs carrying serotype
2, 1/2, 1, 14, 9 and 7 strains and may facilitate control and
eradication programs.
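
How the combined PCR assays could be read out can be sketched as a
simple lookup from positive primer sets to candidate serotypes.
The pairing of primer sets with serotypes below is an assumption
pieced together from the text (assays for serotypes 2 and 1/2,
for 1 and 14, for 9, and the cps7H assay for 7) and from the
primer-set names cps1I, cps2J and cps9H; the function and table
names are hypothetical.

```python
# Assumed mapping from primer set to the serotypes it detects.
ASSAY_SEROTYPES = {
    "cps1I": ("1", "14"),
    "cps2J": ("2", "1/2"),
    "cps9H": ("9",),
    "cps7H": ("7",),
}


def candidate_serotypes(positive_assays):
    """Serotypes compatible with the primer sets that gave a product."""
    candidates = []
    for assay in positive_assays:
        if assay not in ASSAY_SEROTYPES:
            raise ValueError(f"unknown primer set: {assay}")
        candidates.extend(ASSAY_SEROTYPES[assay])
    return candidates


print(candidate_serotypes(["cps7H"]))
print(candidate_serotypes(["cps2J"]))
```

A sample positive for cps2J alone would still need serological
confirmation to separate serotype 2 from 1/2, since that assay
does not distinguish the two.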
TABLE 1. Bacterial strains and plasmids



strain/plasmid     relevant characteristics             source/reference

Strain
E. coli
CC118              PhoA-                                (28)
XL2 blue                                                Stratagene

E. coli
XL2 blue                                                Stratagene

S. suis
10                 virulent serotype 2 strain           (49)
3                  serotype 2                           (63)
17                 serotype 2                           (63)
735                reference strain serotype 2          (63)
T15                serotype 2                           (63)

6555               reference strain serotype 1          (63)
6388               serotype 1                           (63)
6290               serotype 1                           (63)
5637               serotype 1                           (63)

5673               serotype 1/2                         (63)
5679               serotype 1/2                         (63)
5928               serotype 1/2                         (63)
5934               serotype 1/2                         (63)
5209               reference strain serotype 1/2        (63)
5218               reference strain serotype 9          (63)
5973               serotype 9                           (63)
6437               serotype 9                           (63)
6207               serotype 9                           (63)

reference strains  serotypes 1-34                       (9, 56, 14)

S. suis
10                 virulent serotype 2 strain           (51)
10cpsB             isogenic cpsB mutant of strain 10    this work
10cpsEF            isogenic cpsEF mutant of strain 10   this work



Plasmid
pKUN19      replication functions pUC, AmpR                    (23)
pGEM7Zf(+)  replication functions pUC, AmpR                    Promega Corp.
pIC19R      replication functions pUC, AmpR                    (29)
pIC20R      replication functions pUC, AmpR                    (29)
pIC-spc     pIC19R containing SpcR gene of pDL282              lab collection
pDL282      replication functions of pBR322 and                (43)
            pVT736-1, AmpR, SpcR
pPHOS2      pIC-spc containing the truncated phoA gene         this work
            of pPHO7 as a PstI-BamHI fragment
pPHO7       contains truncated phoA gene                       (15)
pPHOS7      pPHOS2 containing chromosomal S. suis DNA          this work
pCPS6       pKUN19 containing 6 kb HindIII fragment            this work (Fig. 1)
            of cps operon
pCPS7       pKUN19 containing 3.5 kb EcoRI-HindIII fragment    this work (Fig. 1)
            of cps operon
pCPS11      pCPS7 in which 0.4 kb PstI-BamHI fragment          this work (Fig. 1)
            of cpsB gene is replaced by SpcR gene of pIC-spc
pCPS17      pKUN19 containing 3.1 kb KpnI fragment             this work (Fig. 1)
            of cps operon
pCPS18      pKUN19 containing 1.8 kb SnaBI fragment            this work (Fig. 1)
            of cps operon
pCPS20      pKUN19 containing 3.3 kb XbaI-HindIII              this work (Fig. 1)
            fragment of cps operon
pCPS23      pGEM7Zf(+) containing 1.5 kb MluI fragment         this work (Fig. 1)
            of cps operon
pCPS25      pIC20R containing 2.5 kb KpnI-SalI fragment        this work (Fig. 1)
            of pCPS17
pCPS26      pKUN19 containing 3.0 kb HindIII fragment          this work (Fig. 1)
            of cps operon
pCPS27      pCPS25 containing 2.3 kb XbaI (blunt)-ClaI         this work (Fig. 1)
            fragment of pCPS20
pCPS28      pCPS27 containing the 1.2 kb PstI-XhoI SpcR        this work (Fig. 1)
            gene of pIC-spc
pCPS29      pKUN19 containing 2.2 kb SacI-PstI fragment        this work (Fig. 1)
            of cps operon
pCPS1-1     pKUN19 containing 5 kb EcoRV fragment              this work (Fig. 1)
            of cps operon of type 1
pCPS1-2     pKUN19 containing 2.2 kb HindIII fragment          this work (Fig. 1)
            of cps operon of type 1
pCPS9-1     pKUN19 containing 1 kb HindIII-XbaI                this work (Fig. 1)
            fragment of cps operon of serotype 9
pCPS9-2     pKUN19 containing 4.0 kb XbaI-XbaI                 this work (Fig. 1)
            fragment of cps operon of serotype 9

AmpR: ampicillin resistant
SpcR: spectinomycin resistant
cps: capsular polysaccharide

Figure 1.

Organization of the cps2 gene cluster of S. suis type 2.

(A) Genetic map of the cps2 gene cluster. The shadowed arrows
represent potential ORFs. Interrupted ORFs indicate the
presence of stop codons or frame-shift mutations. Gene
designations are indicated below the ORFs. The closed arrows
indicate the position of the potential promoter sequences. I
indicates the position of the potential transcription
regulator sequence. I I I indicates the position of the 100-bp
repeated sequence.

(B) Physical map of the cps2 locus.
Restriction sites are as follows: A: AluI; C: ClaI; E: EcoRI;
H: HindIII; K: KpnI; M: MluI; N: NsiI; P: PstI; S: SnaBI; Sa:
SacI; X: XbaI.

(C) The DNA fragments cloned in the various plasmids.
Figure 2

Ethidium bromide stained agarose gel showing PCR products
obtained with chromosomal DNA of S. suis strains belonging to
serotypes 1, 2, 1/2, 9 and 14 and the cps2J, cps1I and cps9H
primer sets as described in Materials and Methods. (A) cps1I
primers. (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
serotype 1 strains; lanes 4-6: serotype 2 strains; lanes 7-9:
serotype 1/2 strains; lanes 10-12: serotype 9 strains and lanes
13-15: serotype 14 strains.

(B) Ethidium bromide stained agarose gel showing PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2, type 1 or type 9 strains and the cps2J, cps1I
and cps9H primer sets as described in Materials and Methods.
Bacterial DNA suitable for PCR was prepared by using the
multiscreen method as described previously (20). (A) cps1I
primers. (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
PCR products obtained with tonsillar swabs collected from pigs
carrying S. suis type 1 strains; lanes 4-6: PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2 strains; lanes 7-9: PCR products obtained with
tonsillar swabs collected from pigs carrying S. suis type 9
strains; lanes 10-12: PCR products obtained with chromosomal
DNA from serotype 9, 2 and 1 strains, respectively; lane 13:
negative control, no DNA present.

Figure 3 

CPS2 nucleotide sequences and corresponding amino acid 
sequences from the open reading frames. 

Figure 4 

CPS1 nucleotide sequences and corresponding amino acid 
sequences from the open reading frames. 

Figure 5 

CPS9 nucleotide sequences and corresponding amino acid 
sequences from the open reading frames. 



Figure 6 

CPS7 nucleotide sequences and corresponding amino acid 
sequences from the open reading frames. 

Figure 7 

Alignments of the N-terminal parts of Cps2J and Cps2K.
Identical amino acids are marked by bars. The amino acids
shown in bold are also conserved in Cps14I and Cps14J of
S. pneumoniae and several other glycosyltransferases (19). The
aspartate residues marked by asterisks are strongly conserved.

Figure 8 

Transmission electron micrographs of thin sections of various
S. suis strains.
(A) wild type strain 10;
(B) mutant strain 10cpsB;
(C) mutant strain 10cpsEF.
Bar = 100 nm

Figure 9

(A) Kinetics of phagocytosis of wild type and mutant S. suis
strains by porcine alveolar macrophages (AM). Phagocytosis was
determined as described in Materials and Methods. The Y-axis
represents the number of CFU per milliliter in the supernatant
fluids as determined by plate counting, the X-axis represents
time in minutes.

□ wild type strain 10;
○ mutant strain 10cpsB;
△ mutant strain 10cpsEF.

(B) Kinetics of intracellular killing of wild type and mutant
S. suis strains by porcine AM. The intracellular killing was
determined as described in Materials and Methods. The Y-axis
represents the number of CFU per ml in the supernatant fluids
after lysis of the macrophages as determined by plate
counting, the X-axis represents time in minutes.

□ wild type strain 10;
○ mutant strain 10cpsB;
△ mutant strain 10cpsEF.


Figure 10

Nucleotide sequence alignment of the highly conserved 100-bp
repeated element.

1) 100-bp repeat between cps2G and cps2H
2) 100-bp repeat within "cps2M"
3) 100-bp repeat between cps2O and cps2P



Figure 11. The cps2, cps9 and cps7 gene clusters of S. suis
serotypes 2, 9 and 7.

(A) Genetic organization of the cps2 gene cluster [84]. The
large arrows represent potential ORFs. Gene designations are
indicated below the ORFs. Identically filled arrows represent
ORFs which showed homology. The small closed arrows indicate
the position of the potential promoter sequences. I indicates
the position of the potential transcription regulator
sequence.

(B) Physical map and genetic organization of the cps9 gene
cluster [15]. Restriction sites are as follows: B: BamHI; P:
PstI; H: HindIII; X: XbaI. The DNA fragments cloned in the
various plasmids are indicated. The open arrows represent
potential ORFs.

(C) Physical map and genetic organization of the cps7 gene
cluster. Restriction sites are as follows: C: ClaI; P: PstI;
Sc: ScaI. The DNA fragments cloned in the various plasmids are
indicated. The open arrows represent potential ORFs.

Figure 12

(A) Ethidium bromide stained agarose gel showing PCR
products obtained with chromosomal DNA of S. suis strains
belonging to serotypes 1, 2, 9 and 7 and the cps7H primer
set. Strain designations are indicated above the lanes. C:
negative control, no DNA present. M: molecular size marker
(lambda digested with EcoRI and HindIII).

(B) Ethidium bromide stained agarose gel showing PCR products
obtained with serotype 7 strains collected in different
countries and from different organs. Bacterial DNA suitable
for PCR was prepared by using the multiscreen method as
described previously [89]. Strain designations are indicated
above the lanes. M: molecular size marker (lambda digested
with EcoRI and HindIII).

CLAIMS

1. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis or a gene or gene fragment
derived thereof.

2. A nucleic acid according to claim 1 encoding a
Streptococcus suis serotype-specific central region,
preferably encoding at least one enzyme or fragment thereof
involved in polysaccharide biosynthesis.

3. A nucleic acid according to claim 1 or 2 hybridising to a
nucleic acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster.

4. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 2 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 3.

5. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 1 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 4.

6. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 9 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 5.

7. A nucleic acid probe or primer derived from a nucleic acid
according to any one of claims 1 to 6 allowing species or
serotype specific detection of Streptococcus suis.

8. A probe or primer according to claim 7 provided with at
least one reporter molecule.

9. A diagnostic test comprising a probe or primer according to
claim 7 or 8.

10. A protein or fragment thereof encoded by a nucleic acid
according to any one of claims 1 to 6.

11. A protein or fragment according to claim 10 capable of
polysaccharide biosynthesis.

12. A method to produce a Streptococcus suis capsular antigen
comprising using a protein or fragment according to claim 11.

13. A Streptococcus suis capsular antigen obtainable by a
method according to claim 12.

14. A vaccine comprising an antigen according to claim 13 and
further comprising a suitable carrier or adjuvant.

15. A recombinant Streptococcus suis mutant provided with a
modified capsular gene cluster.

16. A recombinant micro-organism comprising at least a part of
a capsular gene cluster of Streptococcus suis.

17. A recombinant micro-organism according to claim 16
comprising a lactic acid bacterium.

18. A vaccine comprising a mutant according to claim 15 or a
micro-organism according to claim 16 or 17.

19. A vaccine according to claim 18 comprising a Streptococcus
mutant deficient in capsular expression.

20. A vaccine according to claim 19 wherein said Streptococcus
mutant has been derived by recombinant techniques, preferably
through homologous recombination.

21. A vaccine according to claim 19 or 20 wherein said mutant
is capable of surviving in an immune-competent host.

22. A vaccine according to claim 21 wherein said mutant is
capable of surviving at least 4-5 days, preferably at least
8-10 days, in said host.

23. A vaccine according to any of claims 19 to 22 comprising a
mutant capable of expressing a Streptococcus virulence factor
or antigenic determinant.

24. A vaccine according to any of claims 19 to 23 comprising a
mutant capable of expressing a non-Streptococcus protein.

25. A vaccine according to claim 24 wherein said
non-Streptococcus protein has been derived from a pathogen.

26. A method for controlling or eradicating a Streptococcal
disease in a population comprising vaccinating subjects in
said population with a vaccine according to any one of claims
18 to 25.

27. A method for controlling or eradicating a Streptococcal
disease comprising testing a sample collected from at least
one subject in a population partly or wholly vaccinated with a
vaccine according to any one of claims 19 to 25 for the
presence of encapsulated Streptococcal strains.

28. A method for controlling or eradicating a Streptococcal
disease comprising testing a sample collected from at least
one subject in a population partly or wholly vaccinated with a
vaccine according to any one of claims 19 to 25 for the
presence of capsule-specific antibodies directed against
Streptococcal strains.

29. A method for controlling or eradicating a Streptococcal
disease in a population comprising selecting subjects in said
population vaccinated with a vaccine according to any one of
claims 19 to 25 and testing a sample collected from at least
one subject in said population for the presence of
encapsulated Streptococcal strains and/or for the presence of
capsule-specific antibodies directed against Streptococcal
strains.

AAGCTTGGAT ATTGATCACA
GGCGTGCCCA AGTCCGCAGG 
ACATTATCGT TGTGACGATT 
CTAGGGATAT GTATATCGAA 
GATAGTTTGT CAGCCAGTGG 
AGTGCAGGAT TAGATTTTCC 
GGAACACAGT AAGCTCCTCT 
ACTGAGCAAA TTGGTAGGCA 
TTGGTGAGGC AAGTGCTGAA 
AATCTGTGAC AGCAGCCTTT 
GGTCGAATTG TTATGGCCCA 
GTAAAAGCAA GTTTTCCAAC 
TCTATGCAGT TTTTATGCTG 
ATTCACAGAG TAATAATTTT 
CCTCTTCTTC TAAGTTCGAG 
AATATCTTAC GAATTGCTCC 
ATCTGTAAGA AATCAGCTTT 
ATGCTAGGAG AAAGAATCCC 
TGGAATTCGA TACGGGATGT 
TTGAGCGTGA TAAATGTGAT 
CGTTATCAAT GTAGAGCGAG 
GACTCTTATG TTTGATGAAG 
GCACCGATTC GGAGGGCAGG 
TTATCTGTAT CAAGATAGTG 
TGCTAAATAG TCATCCTCAA 
TGTTTTTGTT GCTATATCAT 
GAATTGTGTA GAAAAACTTA 
CAATTCCATC TAAATTCCGT 
CGAATGAGCT CTATCATTCG 
GCCAAGGTTT CCATTTGTGT 
TGTGATAACC AGCTGGTCTT 
GGAGGAAATC AATTCTGCCA 
GGTAATTTTC CCGCCCAATA 
TTCTTGAAAG TCTGTAGGAT 
CTATCCTCTA AGATATAATA 
AATTCCAACA TAGCCTTTTG 
GCGGAGTTGA CGGATAGAAG 
AGTCAAAATA TCTTGGATGA 
TTTTTATAGA CTATGTTACT 
ATAATGGGGT TGAGGTTCAG 
GATGAAAATA ATTATACCTA 
TTATCTCCTG TCTGATCGAA 
TTGATGTAAA AAAGATGGCT 
CTGACCGTTG GTATCGAATC 
TGGCAGTTAT ATGATGGTCT 
GAAGAAAATT ATTTACGTGA 
ATTGATTCAT CCTTTTGAAT 
GATAGGCAAT CAGTCTTTGA 
AAGTTGGTGA TGATGAACTG 
CTCCCCAGAT TCAGAAAAGA
GCAGGTCAGC TAAAAGTTCA 
TGGTTGGCTA AGAACAATAT 
GGTGGATGGC TTTGAATATT 
ATCAATAAAA ATGTGAAATT 
AAGGGTAGAA AAATATTAAT 
AAGGTTGTTT ATGAAAAAGA 
AATTGGTAAA TTTTGCGCTT 
CCATGTATCG CTATAACATC 
ACGCTTTTGC TAGTAGGAGT 
CGCATATTTA CAGCGCTCTT 
TGATGGAGGT GATGGAAGCA 
CTTATCAGGC AGCTTTTGAG 
ACAGGTGGGC TATCGGGTAG 
GAGCATCCGA ATGTCAATAT 
GGAAATGGAT TTACTTGTAC 
ACAAGTAGTA GAAGCGATAA 
TTGTTTTAGC GAAAGTTGAT 
CTGTCGTTGG TCTTCTCAAT 
GGAAAATTAG AGTTGCTTCA 
GAAGAAATGA AAAAAGCAGG 
CCGCAACAAT GCTAAGTTCT 
GGCTGTTATT GACGAAGTTG 
AAGAAGGTGG ACTTTTGATG 
GGGCTGTAAT TTCCGCTATA 
GGGGATTGTT TGTATGAGAC 
AGTTTATCTG CAAAATCTTG 
CTGTCCGCTG AAATAATAAC 
CTTGCTTAGC TGAAAGGTCA 
TTAAAGCGTA TTTCTCTAGA 
GAAGATGCTG TGTGTTCCGC 
AGAGCTTTTT GCATGATAAG 
ATATCACGTA GCTGATTAGG 
AAAGAGTGTC GGTGTAAAAG 
TAAAGGTAGG CTATGACTAG 
TGATGTAGAC ATCGTATTGC 
AGGTTGAACC GAGAGGGTTG 
ATTTTTCCAG TTTGGAAGAT 
TCAATTGTTT GATAGGGGAT 
TGAATAGGTA GGGTTCTCTA 
GAGAATATAT AGAGCTTGTT 
TTTTTGTATA GACATGATAG 
ATCCCTCTTG CTGGTGATAG 
AGACTTTCTT TTAGACAAAT 
TGAGCTCTAC AGGTATGGTC 
ACCGCTTTTT TCGACAGCGT 
GACAGTGTCT TTGCTACAAT 
GTAATTTCTC TCCACGTTTG 
TAACTTGATA TTTTTTCATC 
AGCTAGTATA TAGAAAAAAT 
GAATTAAGCT ACTCTATGGT 
ATGCAAAAGA AGTAAATACA 
GCAAGCCGGT GCTGGATGCC 
GCCTTTTATA AATTGAATGA 
AGGACAGGTC AAGCAAAAAC 
CATGTATCGT TATATGGATA 
CCACGTTCGT GTAGCGACAG 
TCATTTCACC TCACCGCTTA 
AACAGTACTG GCGACCGTAT 
ATTCTCTCAC TGGCTTCGTC 
TTAGTTAAAA TTCTTTTCAT 
CTCGACTATA TCAAAAAAAG 
TCAGGAATTA TCGGACATTC 
GTACTTCCGA ATCAACGGCA 
ATGAAAAAGA TAACGTTTTC 
TTCTATGATA TAATGGATGC 
GAAGCGGACG AAGTAAGTCG 
TTGGGACTTT ATTCCATTAC 
CTAGATTTCC GGTATTTAAA 
GGCAGTATTG GCTGGATTAT 
ACTTGTTTTT TCACTGGTCA 



TCTAAGTCTG CAGCGGGGTC 
GGAGCTGAGA 

TTTTAATGCG GCACGTGTAG 
CCATTTGATA 

ACCAAATCAA TCGCTTAATT 
CTCACTATCG 

AATCTTGTTA AGAATGGAAG 
ATCCGTATGG 

AAAGGCGCGT GGTCATAAGA 
CTATGATGGT 

TCCAACAATT CTCAGAGTTG 
CAACATCAGG 

GGCTACGAAG TGAAAGCGTG 
GAATAATCCC 

TATTGGATTT CATTCATTCA 
TTCAAAGAAG 

ATTTTCCAAA CATGTGTTGG 
CGCTCCCCTT 

CAGTCTTTTA TTTTATTCCA 
GCAAACATAC 

ATTGGTATCG TAGTCGATTA 
AAGGCTGATT 

ATTTTATATA GATGACGCGA 
AGTCGAAATC 

TTTGCTAATT TTACGATGGC 
TGCAAGCGAG 

ACTTTCCAAT TCTTCTAGGT 
TCCTTGATGT 

TCAAGATTTC CGTTTTTCCA 
GACTACCAGC 

TCCATTAACA GACTTTGAAC 
TAGTTGAATA 

CCGAAAATCT TCATAGGTAA 
TTGGAAATCT 

AGATCTTATT TTGGTATTTT 
GATATTGCTC 

AATCGATGTT CCTCTATTCC 
TAGGTCCCCT 

TGAAGAAAGA CAATATATGA 
ATAATTAAGT 

AATCTAGAGA ATGCCTCGTT 
ATAAGTCAAT 

AGCAAAGGCT GAGTTAGAAG 
CTATCCAGCC 

GGCGAGGTAT AGATTCGAAA 
CCTTATACGG 

GATTTTCAAG GGAGCTTAAA 
TATGACCAAG 

AGAATTTGAG CAGGTGTTTT 
GGAAGAAAAA 

GCAGAGGAAG ATTGCTGTCC 
AAGATTTTAA 

AACCAACTTA CCTTCATACG 
CAGCGCTAAA 

GTTATAGGTA AAAGTCTAGG 
TCCAAGTTCA 

TCTATGTTTG TTCTTAGTGA 
CTATATTGTG 

TGATGTGGCG TAAGAAAGCG 
TCACGTCTGT 



Fig. 3 



DNA Serotype 2 




TGGGATCTAT GGAATGCAAG 
ATTTTCAGAA TATGAAATGA 
CGGACGTTCG TCAGCTTACT 
CCGCTTTATT GGATGACATA 
AGCCCCGGGA CTTCTTACCT 
ATGGTGTTCA ACGGAGTTTT 
CTTTTCTTCA AAAGTGAAAA 
TACTAAGCAG GTGAGTGGAG 
ATGCTTATGG ACCGATTTCT 
ATCGTGCGAC ACATAAGATT 
GTTGCTTTCG CAGATGGCGG 
GGTGTCAATG CTTCTGTGCA 
TAGCAATTAT GTGCGGTTGA 
AATTGATGTA TATAACGATC 
ATTTCCCTGT TGGACAAGTT 
GCTACTCTTT AACAGGGGGT 
GTGATTGCTG CCTTGATTAA 
ATCCTATCTG GATTGGAAGG 
GATTATGAGT TTAGTGAATA 
AGCATTGACA GGAACAGGAC 
GATCACAACT TTATATGATG 
TTCAGTCCGT ACTTGTTGAA 
ATCAAGAAGT AAATGCAATC 
AGAAATTTTT AATTCTCTTA 
GTCTACAGTA GTTTTTTAGT 
AGTCAAAATG TTGAAGCCGG 
GGGTACCTAT TTGGCAAAAG 
AGTAGCAACG GAATTGAATC 
TTTCTATTCC TGTTGATACT 
AAGCGGCACG TATTGCAAAT 
GTTGAGGTCA CCAAGGTAAG 
CCAACCACTC CAAATACAAA 
AGGTATCTTG GCAACAGGTC 
TCCTCAGGAC ATCGAAGAGG 
CAGATTCGAA GAAATTAAAA 
AAAGAGAGGG AGTAAATAAA 
AATATTCAGC TTAGCGGAGC 
GAAGGTAAGA GTACAACTGC 
AGGTTATAAG ACCGTCTTGG 
CAAGCCAATT ACAAAGATTA 
CAGACTTGTC TCAAGGATTA 
GAAAGGTTTC TCCCAACCCT 
AATCTACTTG CGACTCTTCG 
GGACTGGTAA TTGATGCAGC 
TGCAGTAGTA GAAGCAGGCA 
GGAACAAACA GGCACACCGT 
TTGCCACTGA GAAGTATAGT 
AACATAAGTT TGATAAGTAG 
TCATATTTGG TGTGGATGAC 
AAGCTTATCG TCAAGGTGTT 
AAAGGGATGT TTGAAACACC 
GCAGTAGCAG AAGTTTATCC 
GTATTATAGT AAAGATATCT 
CTCGTGCTAT ATTCTCTTGG 
TTCAAGAAGC AGTGAACGAA 
AGCGTTATGA TGCTCTGGCA 
GACAAGGGAT GCTACACTCA 
GAACGAGCAA AAGAATTTAA 
TTTAGTACAT TGTGTTGCTA 
GGAGGCGTAT CAGCTTGTAA 
AAGTTGTAAA ATTTTCAACA 
GTATCCTTGT CCCAGCAAAT 
AGTATCCTTG CTCCAGCCGA 
TCCAAAATGG AATCTACTCA 
GACAGCATAT CAATCTATGT 
TACCAATATT TTAGAAAATG 
AAATATATAG TTTCAAAGTG 
ATAGCTTTAA TATCTATATT 
ACGGTCTCTC GTTCAGATGT 
TTATTGACAA CTACTCCACG 
GCAAAATCAA TACGATAAAC 
CACCTTAGAA AATTTTTATG 
ACTTCATTTC CTTCCTTCAA 
AAGAATTTAC AAGTTTACAT 
CATTTAAACT CAGACCAAGC 
GACAATGACC GTGGTAAAAA 
AAAGATGAGT ACGCCAGAGA 
CTCAATTCAA ACGGATTTGA 
CCCAACTAGA ATCAGGAACA 
GCTCAGACTT ATCTTCTTAT 
GAAATTAACC AAGATAGTCT 
AAATAAAGAT TTTAGGAGAA 
GAAATCGATG TTTTATTCTT 
ACTGCAGTGT TGACTGCGGG 
GACACCTCAA TATGACTCCA 
TGCGGGCTTG ACTAACCAAG 
ACTATCGGGA AATTATCCTA 
TGAAAGAGAG TTTGAAAGAA 
CGTATCGTTT CTATTTCTGT 
AGCCTTCGCA CCTTTGCAGT 
CGATGTGACG ACACTTGAAG 
ACGAAATATC TTGCTTGGTT 
TTGTACTGGT TATGGAGGTT 
TAATGGGATT GACATTGCTA 
TAGGAGAACA ATATGGCGAT 
ACCGAGGAGT ATTTCAATGC 
AGATATTAAG GTTGTTGGTA 
GGCTAGTCTC GCTATTGCCT 
TGGATGCAGA TATCCGAAAT 
CAGGTTTGAC GGATTACCTA 
TGCGATACAG ATATTCCAAA 
ACTGCCCTTT TACAAAGTAA 
TCGCTATTAT GATTATGTTA 
TATCATTGCA CAAAAATGTG 
ATGTTAAGTG CTCATCTTTG 
TCTTAGGCGT TATCTTGAAC 
GAATACGGAA ATTACGGCAA 
GTATTAATAT GATTGATATC 
GGTCCCAAAA CTATTGAAGA 
CGCTATATCG TAGCGACATC 
AGAAAAAATC ATCATGATTA 
TGAAATACGA TTGTGCTATG 
TAAGCAAACT TGAAAAAAAG 
AGTTCAGTAC GGATACTCCT 
ATGACGCTAC TTGGGCTAAC 
TTTCAGTCAG AGAGAGTAGA 
GGTAAATAGT AACCATGTGT 
AAAACGTACT CGATATTTTT 
GCGATATGCA TAATTTATAT 
AAAAAGAGTA TGGTGAGGAT 

Fig. 3 cont. 



CGACTAAATT CAAATTCGAC 
AGTGATATTA 

ATACGACCAA GATAACATCA 
ACTAGCAACT 

TGAATGGCGA GAGTCAAGCG 
AAGATCCAGG 

ACTCAGACTG TTGAAACAGC 
AGTGGTATTG 

CAATATCATT ATGACTGTCA 
AGATTCATAC 

TAACACATGC TGGTATTTAC 
GGATTGACAT 

TTAATCGACT TGGTGGGTGG 
GGGAATTATC 

ATTAGGCTTC GTTCGAGAGC 
CCAGGAAAAA 

ATCTAAAAAA TTACCAGGCA 
GCTTAGAAAC 

CAATTTACAG TAGAGTCACA 
GCGATGCCTG 

GGAGCAATCA AAGGCAGCGA 
AATATGAACA 

ACTAAAAACA ATTTGGAGAA 
GTTGGCATTT 

CTACCCGTAT CTATGTAGTG 
AGTTACAAGC 

TCACAAGATG TATTGACACA 
AAAATATCAG 

GCGTGATGCG GATCCAAATG 
GCAAAAGGTT 

AAGCAGTCCC AGCGGAAGAA 
TATTAGCTGG 

TTGGATGACC GTGTAAAACG 
GGTATAGTAC 

GTTAGAAATT GCACGTACAA 
TATCCGTACC 

TTACCTCTGT TAAATCGAAT 
ATGCTCGTTC 

TCAGTCATGC CTGGTTTCTT 
GCAGGGACAA 

CTTGACCGTA ATTGAGTCAG 
GAATTTTGAA 

TCGTTGACTG TCCACCATTA 
ATGCGATGGT 

AAAAAAGTAA AAGAGCAGTT 
AAATATGATA 

AAAAGCCTAA TTTCTCAGAT 
CATTCGCATA 

GAGCCTGAGT TTGATAAGCG 
TCATAGACGA 

ACTTTCTTCA ACTTAAAGAG 
GTGCTGAATT 

AAAGTACCAA CACTTAATGG 
TGGAAAGAGA 

TCCCGTACTT GCCCATATAG 
AAAGCTAATT 

TGAAGCCTGC TTTAATTGGC 
TAGAGCAGGA 

AGTAGACCTC CGTTTATGAG 
AGAGCGAAGG 
CTTTGTTCAA GAAAAATCCT 
ATAGTGGAGG AGCTATGAAT 
GCATTGTTTG ATATGATAGC 
GCTGATTTAA ATCGTTCTGG 
TGCATTTTTT ATATCTCGTA 
TGAAAAAACA TTTAACTATA 
TTTCATTTAT GTTAGAGAAT 
TAATAAACTT CGTTTTGGTA 
AAGGATAGCT TTCTATTTTC 
GAACTATGGG AAAATATGCA 
AAAAAATCTT GTTGCATTGG 
ACCGCTCTAT TATTCTGTTG 
TGGTCGACTA CGTCTTTATA 
CAGACTTTGA GTTGTTAGGT 
GGTTTTACTG TGTTGAAGAA 
TTTTCCACAA ATTTTTATAA 
AGATATACTT GGAGCAGTAG 
TCCAATTATT CGTAGAGATG 
TTGGACAGAA TGGACGCATA 
AGGTACGTAA GAAAGAATTA 
TTCAAAATGG ACAACGATCC 
TTAGATGAGT TACCACAATT 
AGTCGGTACC CGTCCGCCTA 
AAGATTGAGT TTTAAACCAG 
GAAGTGATAT CACAGATTTT 
GGACCATCTG GTCAGACATT 
TTGTTGAGAG AGGGAGGTCA 
GAGAACAGTT TATATTATTG 
GTTTCGAGAC TTTCGTAGAA 
TTGTTGCATG TACAAGAGAA 
GTTTTTGAAC ATAATGGAGC 
AAAGCCATTG TTTATGATAT 
CAAAGATAGA AATGATACCT 
CATTTATCTT TTTAAGAAGC 
TAAACCCAGA CGGTCATGAA 
GGAAATTTTC TGAGAGTTTG 
GATAGCAAAA ATATTGAAAA 
TCTTATATTG CTTATGGAAC 
AGATAGTGTA GTACGTGAGT 
GGTTGTTGGA CGATTTGTGC 
AGTTTATGAA ATCATATTCA 
CCTTTTATGA GAAATTGAAA 
ATAAAGTTTG TTGGAACAGT 
TTTGCTTATT TTCATGGTCA 
TGAAGCACTT TCTTCTACTA 
AGGGGAAGAA GGAGCGAAAT 
TTGACAGTTG TGAGCAATTA 
AACAAGTCAA AGAAAGATTT 
AAGTTGTTTA AAGGATAAGT 
TATGGGGCAG ATAAGGTTCT 
TGAATTTGAA GCGCATGTTA 
AGTTGGTGCG CAAGTTGAAG 
ATTTTAATCC AAAAGGGATT 
TTGCTCAATA TGCCATAGAA 
ACCGCTGTCT TAGAAGGCAT 
GTTCATGAGA TTATTGTCAA 
TTTAATGGGG CGTTTTGCTG 
AAAACAATCA CCTCATATCA 
GGGTAGATAA TAAAGTGTTT 
TTGACGAAGA GGCTCTTGTC 
TTGTTGATAT TGAAAAATCA 
ATTGAAATAG GATATCGCCA 
AGTTACGATT TCTGCAATCT 
AATTTTTATC ATAATGATGG 
TGCCGGTTGA ATTTGAGTAT 
GTATAATATT TGTAATTTTT 
AATTTCGCAC TTTCAAGACG 
TACCTATTTA ACGTAATTAT 
GACAACCTAT CAAAAAAAGA 
AGTTTTATTT GAATCAGATA 
TAATTTTAGG TACAGAAATA 
AAGAAGCTAT AGGGTTTTCA 
AATTTACCAA GTGAATATTT 
ATTGATGTAG GCGTTGATAT 
TAAAAAAATC CAAATGCTAG 
GCCTAGTCAC ATCTGGATGA 
TCGGGTTAAT TATTAGTGGT 
GTGGGCCAGC CATTTTTGCT 
TTTACATTCT ACAAGTTTCG 
ATGGCTCAAA ACCAGATGCA 
TAGAATTACT CCAATTGGAC 
TTATAATGTT CTAATTGGAG 
CAGTTGATGA ATTTGAAAAA 
GGATTACAGG TCTTTGGCAA 
AATGAAGTCG TTAGGCTGGA 
AAGATTTTAT TGAAGACAGT 
GTAAGACTCC TTTAAAACAA 
GTTCAAAAGG AATACCAGCA 
AAATTAACTG AGTATCAGAA 
AATTCAGCAA AATCAGATAT 
AACATGTTTT AATATTGATG 
TATGGCTCTC AAGAAATCTA 
CTCCAATTTT CTACATTCTT 
AGATTGAATC AATTGGAGGT 
TGGCTACGTG AAAAGTGGAG 
ATGTTAAAAT ACGCTGATTT 
ATATATTCAT GAAGATTATC 
AGACTTAGAT AAATCACGCC 
GGTATAAGGA GAAGGAAATT 
CTGAAAATAA CTATGAAGTA 
AGAAAAGATT TTGTTTTGAT 
AAAGAAACAG GGTTCGATAA 
CTATAATCAG GAGCTGTTAA 
CGAGGTTGGA GGAACGAACC 
AACTAAATCT TCTTCTAGAT 
ACTGGAATAA AGATAATCTT 
TCACAAGAAC AAATTAATGA 
TCTTGGGATT TTATTGTTGA 
TATGAAAAAG ATTCTATATC 
CTTGGAACTT ATAAAAGGCT 
TCCTACCTAA TGATGGAGTC 
TTATTAACTA TCCAATTCTA 
TTTGACTACT TCATATCATA 
AATAAGGTTG ACATAATTCA 
TTATCTGAAG CGAAAACTCA 
ACCTAAATTC ATCTCTGATT 
ATAAGATTGT GACAGTTTCA 
AAGATGACCA AATCAGTGTA 
TATCAGTCCG ATGCTCGGTC 
ATTGGTATGG TCGGTCGAGT 



AGTACAGTAA CCTCATAGAA 
AACGAAATTG 

TAACAAGTCA TATACCAAAT 
TTCATTATTT 

AGAGGTAATC TGATAGAGTT 
CTTATGGCAG 

TGGTGCCGTG TATTTCACAT 
TAAGCAGTTT 

CGATTCTAAT TACAACGGCT 
TACTATTTCA 

GATAAAATTA ATTTACCATT 
ACAAGGGAAG 

TGACTTAAAG CAATTAGTTT 
TAATTCATTC 

GTGACCATAG CATCGTCACT 
AACGACTTTT 

ATAGTTTCTA TTTTGTTAAT 
CAGAAACGAG 

TTCGATGTTT GTTGATGCCG 
AGGTGGGATG 

ACTTCATACG AAAAACAAGT 
ATATGAGTCT 

TATACTCCTA GTCAAAAGAG 
GTGAGCGGAA 

CCTAACATAC ATTGATAATT 
GAAAGTTGTA 

AGAATAGTAG TAGGGGATAT 
AAGTATGGTG 

AGATAAATCA ATTAATTATT 
TACAGGAGAA 

TGCCAAATAT TGGTTCAGCA 
TTGAAATTGC 

GCTTGTCGGA TTGGTCCTTT 
CAACTTTTCG 

TTATCCCGTC CGACAGTATT 
ACTAATTTGT 

GAAAATATGC TCCTGAAACA 
TTTCTCCGAC 

TCAGAAAATG ATTACTATTT 
ATGATTCGAG 

AACGAATGTA GAGCATAATT 
AGATAAGCGT 

AATATATTCG TGAAAATGCA 
CATCTTTACT 

GTGGGCTTTA ATAGAGAAGT 
CACAGAGTTA 

TATGGATAGT TTATCAACAA 
TGAGTATGAG 

TCCATGCTGG AGCAGAATTA 
TAGATAAGAA 

CTAGTGCCAG CATTAAGAGA 
CGTAGGAAAT 

TCATCACTAT TCTAAACAGA 
CAATAATACT 

AATTACCTTT GTTGTGGCAT 
CGATCAATTT 

CAGGCTGTGG CAAACCATAT 
ATCTACAATG 

TGTTCGAGAA AGATTTGACA 
CAATGCGTGG 



Fig. 3 cont. 



AAAGGACAAG GAGATTTTTT AGAAGCAGTT GCTCCTATAC TCGAACAGAA TCCAAAAGCT 
ATCGCCTTTA TAGCAGGAAG TGCTTTTGAA GGAGAAGAGT GGCGAGTAGT 
AGAATTAGAA AAGAAGATTT CTCAATTAAA GGTCTCTTCT CAAGTCAGAC GAATGGATTA 
TTATGCAAAT ACCACTGAAT TATATAATAT GTTTGATATT TTTGTACTTC 
CAAGTACTAA TCCAGACCCT CTACCAACGG TTGTACTAAA AGCAATGGCA TGCGGTAAAC 
CTGTTGTCGG TTACCGACAT GGTGGTGTTT GTGAGATGGT GAAAGAAGGT 
GTTAACGGTT TCTTAGTCAC TCCGAACTCA CCGTTAAATT TATCAAAAGT AATTCTTCAG 
TTATCGGAAA ATATAAATCT CAGAAAAAAA ATTGGTAATA ATTCTATAGA 
ACGTCAAAAA GAACATTTTT CGTTAAAAAG CTATGTAAAA AATTTTTCGA AAGTCTACAC 
CTCCCTCAAA GTATACTGAT TGGCTGAAGT GAATGCTTTA GTATAGCGAT 
TTATCGTATT CTCATTCGAT AAAACAAATG TTCAGAAACA GTTATAAGTT ATTTCTAAAG 
GGCACCTCTA TAAACTCCCA AAATTGCGAA TTTGGAGTTA CGAAAGCCTT 
GTTAAATCAA CATTTTAAAT TTTAGAAAAT TAGTTTTTAG AGCTCCCCTA AAATAGAAGA 
TAACAGAAGG GAGCCTTCAA AAACTTCATT TTTAATTGGA TTGTAGAAAA 
ACTGTTAAAT CAATATTTAG ATTTTTAGGA GTTCAGTTTT TGGGGGGAGA GCTTAATAAT 
CTATGCACTA TATTTCGAAA AATATATGGT GTAAAATCAG AACTGATGGT 
CGTGGCAAAA AAGAGAATGA GGAATTTATG AAAATTATTT CTTTTACAAT GGTTAATAAC 
GAAAGTGAGA TAATAGAGTC ATTTATACGG TATAATTATA ACTTTATTGA 
CGAGATGGTC ATTATTGATA ATGGTTGTAC AGATAACACG ATGCAAATTA TTTTTAATTT 
GATTAAAGAG GGATATAAAA TATCCGTATA TGATGAGTCT TTAGAGGCAT 
ATAATCAGTA TCGACTTGAT AATAAATATC TAACGAAAAT AATTGCTGAA AAAAATCCAG 
ATTTGATAAT ACCTTTGGAT GCGGATGAAT TTTTAACAGC CG&TTCAAAT 
CCACGGAAAC TTTTGGAACA ACTGGACTTA GAAAAGATAC ATTATGTGAA TTGGCAATGG 
TTTGTTATGA CTAAAAAAGA TGATATTAAT GATTCGTTTA TACCACGTAG 
AATGCAATAT TGTTTTGAAA AACCTGTTTG GCATCATTCT GATGGTAAAC CAGTTACTAA 
ATGTATAATT TCCGCTAAGT ATTACAAAAA AATGAATTTA AAGCTATCGA 
TGGGACATCA CACTGTTTTT GGTAACCCAA ATGTAAGGAT AGAACATCAT AATGATTTGA 
AATTTGCACA TTATCGAGCT ATTAGCCAAG AGCAATTAAT TTATAAAACA 
ATTTGTTACA CTATTCGCGA TATTGCTACT ATGGAGAACA ATATCGAAAC AGCTCAAAGA 
ACAAATCAGA TGGCGCTCAT TGAATCTGGC GTGGATATGT GGGAAACGGC 
GAGAGAAGCC TCTTATTCAG GTTATGATTG TAATGTTATA CATGCACCAA TTGATTTAAG 
TTTTTGTAAA GAAAATATTG TAATAAAATA TAACGAACTA TCCAGAGAAA 
CAGTAGCAGA ACGCGTGATG AAAACGGGAA GAGAAATGGC TGTTCGTGCA TATAATGTGG 
AGCGAAAACA AAAAGAAAAG AAATTTCTAA AACCTATTAT ATTTGTATTA 
GATGGGTTAA AAGGAGATGA GTATATTCAT CCCAATCCAT CAAATCATTT GACGATCTTA 
ACTGAAATGT ATAACGTCAG AGGCTTACTT ACCGATAATC ACCAAATTAA 
ATTTCTCAAA GTTAATTATA GATTAATTAT AACTCCAGAT TTTGCTAAGT TTTTACCGCA 
TGAATTTATT GTTGTACCAG ATACCTTGGA TATAGAGCAA GTTAAAAGCC 
AGTATGTTGG TACAGGTGTA GACTTGTCAA AGATTATTTC TTTAAAAGAG TATCGAAAAG 
AGATAGGCTT TATTGGTAAT TTGTATGCGC TTTTAGGATT TGTTCCGAAT 
ATGCTCAATA GAATTTATCT ATATATTCAG AGAAACGGTA TTGCAAACAC TATTATAAAA 
ATCAAGTCGA GATTGTGAGA GTTGTTTACT TTTATTTGTA ATTTTAAAAG 
TAATGCAGGC AGATAGGAGA AAAACGTTTG GAAAAATGAG AATAAGAATT AATAATTTGT 
TTTTTGTTGC CATAGCGTTT ATGGGCATAA TTATTAGTAA TTCGCAAGTT 
GTTCTAGCGA TAGGCAAAGC TTCTGTGATT CAGTATCTAT CTTATTTAGT TTTGATTTTA 
TGTATAGTTA ATGATTTATT AAAAAATAAC AAACATATTG TAGTTTATAA 
ATTAGGGTAT TTGTTTCTTA TTATATTTTT ATTTACTATC GGAATATGTC AGCAAATTCT 
TCCTATAACA ACTAAAATAT ATTTATCAAT TTCAATGATG ATTATTTCAG 
TTTTAGCAAC GTTGCCAATA AGTTTGATAA AAGATATTGA TGATTTTAGA CGGATTTCAA 
ATCATTTGTT ATTCGCTCTT TTTATAACTT CGATATTAGG AATAAAGATG 
GGGGCAACGA TGTTCACGGG GGCAGTAGAA GGTATCGGTT TTAGTCAGGG TTTTAATGGA 
GGATTGACGC ATAAGAACTT TTTTGGAATA ACTATTTTAA TGGGGTTCGT 
ATTAACTTAC TTGGCGTATA AGTATGGTTC CTATAAAAGA ACGGATCGTT TTATTTTAGG 
ATTAGAATTG TTTTTGATTC TTATTTCAAA CACACGCTCA GTTTATTTAA 
TACTATTGCT TTTTCTATTT CTTGTTAATC TTGACAAAAT CAAAATAGAA CAAAGACAAT 
GGAGTACGCT TAAATATATT TCCATGCTAT TTTGTGCTAT TTTTTTATAC 
TATTTCTTTG GTTTTTTAAT AACACATAGT GATTCTTACG CTCATCGCGT TAATGGTCTT 
ATTAATTTTT TTGAGTATTA TAGAAATGAT TGGTTCCATC TAATGTTTGG 
TGCAGCGGAT TTGGCATATG GGGATTTAAC TTTAGACTAT GCTATAAGGG TTAGACGCGT 
TTTAGGTTGG AATGGAACGC TTGAAATGCC CTTACTGAGT ATTATGTTAA 
AAAATGGTTT TATCGGTCTG 
TAAGAATATT AAAAACAGAT 
ATCATTGTAG TCCTATCTGC 
ATGCCAATAT GTTTTTGTTT 
TATTAACAAA CAACTGCAAA 
TGGTAGAGCA TATGTTCTAT 
ATGATGATTT TTATGATAGC 
AGTCAGCATT ATTGTACCTA 
GTTTAGATAG CATTATTTCC 
GTTCTTCAGA TTCATCAACG 
GGTAGAATAA AACTTTTCCG 
ATCAAAAATA GCACAGCAAA 
TGTTGACGGC AACATTGTTG 
GTCGGGAGGG TTACTTGCTA 
TGCAAAAGTG TCAAATTGAT 
ATTTTCCCAA TCATTATATG 
CTTTATAAGA ATATATATAT 
TTATTATTTA ATCTAAATTA 
TAACAGAAAT CTTTATTTTG 
TGATGTTTTT ATTCAATTAG 
TTGTTAAAAT ATTTGGTGGA 
ATATTATTTA TTATAGCTTA 
CCAAAGAAAT TGCATATATT 
ATTAAACGAA CGTCCTCTGT 
TAATAATTTG TTTAAAATTT 
ACATTTCTAT CATCGTCCCA 
TGTATAAATA GCATTGTAAA 
GGTAGTACGG ATAATTCGGA 
TAGTCGCATT CGTTATTTTA 
CATAAGTCGC GCCAAGGGTG 
TTATTCATTC GGAGTTCATC 
TGGCAGTTGC TGGTTATGAT 
GCAGAGCCGC TTCCTACAAA 
CTAGAGGCGG ATGGTCATCG 
AAAAGAACTA TTTGAAGATT 
CACTTATCGC TTGCTCTATG 
GCTTGTACTA TTATGTTGAC 
GCTTCCATTG CCTACTGGAA 
AGTAGAGGAG ATAAAGAGCT 
TTGTTTTTAG GCAAATATAA 
TCTCCAAACG CTATTTAGAA 
ACTAATGAAT GCTTATTATT 
TCTTTCTGAA AACGGGGAAA 
CTCGGTAAGA ATGTTGTAAT 
AATCCAACAA ATAGTAGAAT 
GTTATTTTTA CACATCTGGA 
CTTTCCGTAT TTCGTTGACA 
AACGAATAAG TGGAATACAA 
GTGGTATAAA AGAAAGTATA 
TTTTATTGAG CTATTCGAGA 
GTTCTTCATC GCTCCGTTCA 
GTTGGGAGTT ACTATGTTCC 
AATTTTGTAT GTTCTTTTCG 
AAACTTTTCA GTGGATGCCA 
ATATATAGAC TAATATCACT 
AGCAGGATGT GCGTTCCAAG 
ATTATTGGAG CAATTCTGAT 
GGTTGGAAGT CTACTTCCTT 
ATTTTTTATG ATAAAGTATG 
TGCTTCTTAT CATATCTACT 
GTAGGGTATG GGATTGTTTT 
AATATAAAAA CAATAGGAAA 
AACAGTAGAA AATTATATTG 
ATTAAATTCT ATATCTACTA 
CATAAATTGG CAGGAATAGA 
AGGTGGCAAG ATAAAGATAG 
AAAGCAAGTT ACGGCATAAA 
TTTTTAATAC GGAAAAGTAC 
CAATCGTATA CTAATCTAGA 
GATATATGTT TGGAATACGC 
GTTACCAAAT GGTGGTGTTT 
TTATATTATG TTTGTAGATT 
AGTCCTTATA CACCTGTTTA 
CTTTTGATGG AAATTATCAA 
TTGGAAGAGA TAAAAGAGGT 
AGCGGTATCT TTAATAGCCC 
AAACCAAGGT TTTGACACTG 
TTTAAAGAAT ATAAAAAAAG 
CCAGAAGAAG TTTACAAAGT 
AAAATTTAGA AGAAAAAACT 
CAATATGAAT TTTCTGTTTT 
TTAATGTTCA AAAATGGAGA 
TAAGTATTTA TACAATAGGC 
TTTTAAAAGA ATATGTAAAT 
TTTTAAATAC TTTAATTAGG 
ATTTACAATG TTGAACAATA 
TCAGACCTAC AAACATATAG 
AGAAATTTGT TTAGCATATG 
AAAAAGAGAA CGGCGGGCTA 
ACTACTTAGC TTTTATAGAC 
CAACGTTTAC ACGAAGCAAT 
AGGGTAGATG CTTCGGGGCA 
TCAGGCTGTT CTGAGCGGCA 
CTTTGTGGTG GCCTGGAATA 
TTCGATTTGA AAAGGGTAAG 
AGTTAGAAAA AGTTGCAATA 
CGAGAAAATA GTATCATAAC 
TTTCAAAATG AACGAATGGA 
CTTACTAGAG TGTTATCGTT 
TCATTGGTTG AGCAAACAGC 
TTGTATATAA ACAATTGAAG 
TGGTAGGGTG TCTTCATCTT 
GATAAAATTC AAGAAAGATT 
AAATGGTTGA AAGAAAAGGG 
AGCACTCTTT GATACGATTA 
TTGGTCTGTT GAGCAGCGTC 
TGGCTGTTCC AATTTTTCTG 
AACAAGAGAC GCTAAAGCTC 
AACATGCTTT GTCTCTATGC 
ACCATCTGAT AGGAGTAAAG 
TTTGTCCTGT GGCTACTTTC 
GTTGTTGATT CAGGTAGTTT 
AGAAAAATAA ATGGTTGGGC 
TATTTGCTAA CATGGCTGAA 
TCGTTATCTT TTTGTTCTAG 
GTAGATACTT TCATTGCGAC 
TTTTGTGAAT CATTCTATAG 
TCTATGCGTC CCATTTGCGT 
GACAGAAGAT TCCAGCAATA 
TGACCCAGAT GCTGTATTTT 



ATATAAACTT TATCGTAATG 
GTCTGTATTT 

TAAATTTAAG TTTTGTATTT 
TGGAATCAAC 

GTTTTGAGTT GCTATTAATT 
TATTTTTTAC 

AGGAATTAGA GGATGGAAAA 
TTAAGAGAGT 

GATTCTTTTG ATAGATGACG 
AGAGCAAGAT 

CAAACGCAAG GAATTACGGT 
CTGATGATAT 

AAAGAGAATG ATAGTGATTT 
GAATCTGAGC 

GCGAGACTTA GGAAATGAAA 
TTGTTGCAAA 

AACAGTGGTT AGGAGAGGAC 
TCCGCTATGT 

ACTACAAATA CGTTTAAATA 
TTTGATTTGT 

TAAAGAGACG CTACAGTGGC 
TGAATCGCTT 

ATTCTTTAGA TACTCTAAGT 
TAATTGTTGC 

GAAGAAAAAA ATAATGATTA 
TCTATCCAAG 

AGATTCTTCT GGTGAATGAC 
CGAAGAAAGA 

TCAGATGCCC GTAATTATGG 
TCAGATGATT 

TGAGAGAGAG AATGCCCTTG 
TTTCTTAACA 

GGAATGTTTG TAAAAAGCTG 
AACTCTATAA 

ATTCATGAAG ATGAATACTT 
GTTAAGGAGT 

TTCTAGTATG ACTGACCATC 
CTTCTATGAA 

CATTTTTAGC CTTTGCTGTT 
AAAAGAAGCT 

CAAAATAAGC GACTTGCTTT 
AATTTTAGTG 

GAGAAGAAGT GAAAGTAGTA 
GATTAAAATG 

AATGTATCAT GGTACTTTGT 
AATGGTTTAT 

TTGCTTTCTG CCTATTTTCG 
AAGTTCAGCA 

TATCGTGATG GCTGTTAATG 
CCTTTTTCAG 

TGGAGAATCG GGTCCAGGGA 
TTTTATTACC 

TTGCTTACTT GTTTTTTAGT 
CACGGCATAT 

GGCTTGGTTT TTTCTTTCAA 
CCTATTTGGG 

AGCCCTTCTC CTGGTTTTAT 
ATGCTATGCT 

CTGTTGTCAA AATTGGGAGT 
TCAGTAGTCG 
CACCATTTTT AGCAGTGCAA 
CCTTTCTAAT TTGCCTGTTT 
TTTATGAGAG TACGTGGAAA 
TTATTAGCAG ATTCGCATGT 
TGACGAGAGA GGAGTGGTAT 
ATATTTATTA ACTGAACAAG 
CCCAGCACGA TCTGGTTCGA 
ACCCATGATT TTTCACACGA 
AGAAAGAAGA CATCTATGTC 
ATTCCCAAAA TTGCGAATTT 
CTTAAATTTT AGAAAATTAG 
TCAATGTATT GTTAAGACCC 
AATGAAGTCA GTACGCACTT 
TTCTGCAAGT CACCTCACCG 
ATGAATATGT ACTTACAGGG 
GAAAGAGTGA ATCAGTACAT 
GGTTACTTAT CCTTAAAGTC 
CCATACGATG TATCAGATCC 
GAGATAGTTT GATGTCCGTT 
AGCAGCATGT TCATCTAATC 
TTCTTTGATA GCTAGATCAA 
GGTGATCATT TTTCAAGATC 
ATCGGATTCC CTATTCTCTT 
GCGATTTGGA GTCAATTCAA 
TGGTATTTTA AACCAACATA 
AATGATCTGT CGCTCGTACA 
AGTTCCGAGA AAGCAATTAT 
GATATTTGGA GCAAGGGCGA 
GATTGCTATT ATTGACCCAG 
GGAGATTTAT GTAGCAGGTC 
ACATAAAACC GCACCCACGA 
TTCTGCCTCA AGGTATTCCG 
CGTTTTGATA TCGGTATGAC 
GAGAAAGTGT ATTTAAAGGA 
TTTGCGTGAG GGGATAGAAT 
CTCGTCTATA CGATTGGAAC 
CCTCCCCATC TATACTCGTG 
TTCGTGGGTG GGGCTAGTTG 
CTTTTGGCCC GGGATGGGTA 
TGGTCTCTTC TATCGCTTTC 
CTCAGTCAGC CCCTATCGCT 
TTGCAAAGTT TTATGAGTGT 
GCAGCGGCAG CAGTCCATGT 
TGCTTTATCT TTATTTCTCA 
GTGTAATGGC AAACTCGGCA 
ATAAGAAGAT TGGGCTTCAT 
AGTATATCGA TTCCTCTTAT 
AGAATCATGC TCGGCAAGAT 
TTTCGGCTAC ACACTTGCGT 
GTGTCCGTGG TATTTTGAGA 
GTTATGTCCG TTACTATCTG 
ACCCTGAATT AGCGATGTTG 
GGATTTATTC CCATGATTAT 
AATATCCAGT TTTATAGTGG 
TATAGCAGGT GTACTAAATA 
GTGCTGCTTT GCAACGACTG 
ATTTTGTTGC TAAGAAAAAG 
TAATTGCTCT TGTTGTCGTC 
TCAATCTGGA TTCGTTGGTC 
AGAAAGGAAT TAACAGTTGC 
TTTAAGGTAT CTTCGTTGAA 
GGTGGCTATA TTTTCTACAA 
ACGATAATGA CTCATTTCAG 
TAATATTCCG ACAAAGAAAT 
CTGTTTCTAA ACCCCAGTAT 
GAGAATTTTT AAGAGAACTG 
AAGGCTTACC GAATAAAAAC 
TTGATGTGGC AATTGAATCA 
AGTACGGATT CAGAAATGTA 
GGAGTTACGA AAGCCTTGTT 
TTTTTAGAGG TCCCCAAGGG 
AAAGAACTAT CTACTTATCA 
TTTTACGAAT CTGGATTTTA 
TTACGGACTG GCGAACAGAT 
GGACTCAGAA AATGTTTTGC 
TATCGAAGCT GTACAGGGGT 
TGTATGTAGA AGGAGAAAAA 
TGATTTCCTT GTTAAAGATG 
GATATCATCG GGCATTTTCC 
GAGGGAGACG GAGCGTTCAT 
AAACAAAAGA ACGCCTTTCC 
ACCGTCAAGT CGGTCATTTT 
TTGGAGGATG GTTATAATTT 
TCATCTGTCT GGAAAAGACT 
TTTGATTGGT TCAAGTCTCT 
ATTTGACTAG GCTTATAAAC 
TTGATCAAGC ATCGCCAGAG 
TAGTAGCGGA TGAAGAGTCT 
CCCTTGTCTT GGGATTATCA 
TTGCCCCTTA TCGGGAAGAC 
GATGGGGTTG ATTATTCATT 
TTTGAGTTGT TCGAAATGGC 
CTATAGTTCG TCTGCTTTAG 
CACTTTTCCT CTTCTTTCAA 
AGGAGGATTC ATGTCTAAAA 
CATCCTCGTT CAGGGATTAG 
TCATTTCTCA GGAAGTATAT 
GTCTCTTTAT CGGTCTACAG 
CACTTCCGCG AGAAATTTGA 
TTTTTACCAA TTTTTGGGCT 
CCTATTTGGT TTGCCTGATT 
TGTGCAAGGA TTTTTTACGA 
GGACTTTACT CCTATCGGTA 
TCTTTTCGAT GGAGAATGAT 
ACGACTGGTG TTTTTGCTTG 
TTTCGAAAGG ACTATCTTCG 
TTTTCATGGA TTAGGTCATA 
GCTAACACTG TCAGATGTAG 
CTATCTTACA AATTGTGTTT 
AAAAGAGAGG TGCAGATAAA 
GCGATTGGCC TGTTTGTGAC 
TTAGGTGGAT CTGAGTATCG 
TGTCGGGGTG TTCTTTGTAT 
AAATACAAAG TTTTTGCCAA 
TTTCCGTCCA CTTTGTTTTG 
CTTCCTATCT GTTGTTGCTA 
TATGCTTACG ATGAAGTTGC 
TATACAGGCT TGATGACAGT 
ACTAGGAATA GCGGTTCTAG 
CCTCAATACA TTCAGGGAAA 

Fig. 3 cont. 



TTTGTGGAAC GGCTTGTTTA 
AGTGGATCTG 

ATTAGCAGAT GCCATTTCGT 
TCAAATAGGT 

CCCCCTTTAT TTTCAAAGCT 
TTTGTTTAAT 

ATGCTATTTT TGGACGGGAA 
GGTTGTTTTG 

TAAGGGGGGC ACCTCTATAA 
AAATCAACAT 

GATTTGCGAG ACAAGAGGCA 
TACTCCATCG 

TGAAGATTGT ATATTTGTTC 
AAAAGAAGCC 

ATTTCAATGA TGAAGGGCAA 
TATAAAAAGG 

TTGAGACGAA TTTATATTTG 
GACGTTGAGA 

AGATGTCAGG GAGCAACTGC 
TTGATCTATA 

TTGTTACAGA GCTATGACGA 
TTAAATAAAC 

TTTCAAGGAT AAAAGAGTGT 
CTTTTATCAA 

ATTGTCAATC CATTGAGGTC 
CCTTTGTAGA 

AAGGTGCAAG CGCTGCTGCA 
TCTCAAAAAC 

TGTGACCGAA GAGAGTTGTT 
TATACAATCT 

TCTGGGTAAG GCTGTGGTGC 
AGGTAATATC 

ATTTTTTAAA TTGTTTTGAA 
AAAATGATAT 

AATCAATAGT TGTCTCAGGT 
CCTTCATTAC 

GGGCAGTTTA GCTTGTATAA 
TTAGGTGGGG 

TGATTTCGTA TCCACCTTGA 
ATCTTTTCTC 

GGGTCGTTCC GCTTTACTTT 
CCTATTTAGT 

CTGAGCGCTG TTATCAACAC 
TTCATCGCTC 

TGTGTCCTTG TTGTTTTTCT 
GTATGGTTTA 

ATGTACTCAA TCAATTTGAC 
CCCTATACAG 

TCGAGCTTGA ATACGGTATG 
GATTTGCTCA 

TTTTGGATTT CTAACAATTT 
TTTCAGTATG 

TTCTTTATAG TTTTCCAGCC 
TTGGTACTTT 

ATACCGACAA AGAATTTATG 
GTCTTGCATT 

GATTTCAACA TTTGTTAAGG 
ATTTGTCGGT 

TCGTTTATGC CTACATTTTT 
AACGGTCTAA 
ATAAGGGCAC CTCTATAAAC 
ATCAAACATT TTAAATTTTA 
AAACGTCCCA AATGAGAGGT 
CAAGTATTCT CTACCATGAA 
AATGTTTCTT AAAGGACGTA 
CTAAAAATTG GGTAGAAAAG 
TCGTGTTGAA GAGCAGGCTT 
GGCTATTTTT TCTAATCGGC 
TGTTCTTTTA CTACTTAAGG 
ACAGTCATTT TTAGAAAGGA 
GGTTGTAATC ACAACGGTGA 
TGTGGTGTGG ATGCCGTTAA 
TTCAAAATAC GCACCAAAGG 
GCTCGAAATG ACTCGTCGTT 
TGCGTGATTA CTGTCTTGAA 
CATTGGACTT CTTGATTAGC 
GGTGAGATTA CCAATCTTCC 
CTTTCAACTG GTATGGCTGT 
TTTGCAGGAA AATGGAACGA 
CCCTTACCCT GCTTTGAATT 
TTCCAAACTT AACAATTGGC 
CTGCAGCAAT GGGAGCTGAA 
GAAATGGAAG GACCAGATCA 
AAAGGAGTGA GGATAGTGGA 
AGAAGAAGTT GAAGTACGAA 
AATTGCTAAA GGCGAAGTCT 
CAGGAAATGG AATTTCGCCA 
ATTTTGAGGA AGACCAAAAT 
TAAGCGGAGT AAGGATGAAA 
TTATGCGTCG CTTATTGAGC 
GATCTTGTAG TGACAGCCAT 
GAAGCGGACA AGCGTAGGAT 
TACGTCTAAG CAGACAATCG 
TTTTGAAGAA GTCCAGTATG 
AGATGCTACC AGTTGCCAAT 
GTGGTGAAAA AACCATGGGA 
ACCAAGATGA GTCACCTTCA 
CTAGGAGAAA ATCCAACCAT 
AATGTTTTAA AACAAGACTT 
TTTGCCGAGG ATTACTATGT 
TAACACAGCC GAAGAACAAA 
GTGTTTGATA ATTGGATCCA 
AATTGATGCA TGAATTTGTA 
CTCGTTATTA CCATTCCTTG 
TCTTCGTCAG GTTTGATTGA 
CGCCAATTTG GACGTTTGTC 
TAAGGAAGCG ATTGTTGGTG 
ATTTGAACAA CCTGATTCTG 
TTTTATCTGT ACAGGCCTCA 
ATGAAAAAAG TAGCCTTTCT 
TCCTTGGTTG GATAGAACTC 
TGACTATCGT GGCTATCCTG 
ATTTGGATGA TGGAAAAGTA 
AGGAAATCTT TGACTTGCTT 
ATCATTAGCG AGCAAGCCAA 
ATAGGTTTTT CAAGTTTTGT 
TATCATCAAT ACGGGTGCCA 
TACTCCAGGA GTGACCATAA 
ATATTGGAAG TGGTTCAACA 
GGGCAGGGAC AGTTGTTTTG 
TCCCAAAATT GCGAATTTGG 
GAAAATTAGT TTTTAGAGGT 
GCTCATAAGA ATTGACCATC 
AATTGTGCTA TAATCAAGTA 
TGCGCCTCTG CTTATGCCAG 
CAGATTAAAC TTCCACCAAT 
TAGAAGCAAC AAGCCCTGAG 
TATCAGAAGT GAAGTAGCGA 
AAAACCAAGC TGCTCCCTCA 
AATAAAATGG TTTATATTAT 
TGTTCATCTA GCACGGAAAA 
ATTTCAGACA TTTAAGGCAG 
CCGAATACCA AAAAATTACA 
TGGAATTGAG CTTTGAAGAG 
AAGGGAGTTG ATGTGTTTTC 
ACAGATATGC CCGTTTATAA 
CTATTTGGAA AAAATTGGTC 
TATGGATGAA ATTCATCAAG 
CCGATATTTC GATTTTGCAT 
TGAATGTCTT GCATACCTTG 
TATTCAGACC ATAGTGTTGG 
TTGATTGAAA AGCACTTTAC 
TAAAGCGAGT GCTACTCCTG 
ACAATCTCTT GGTAAATTTG 
ATAAAATTGT AGCTAGAAAA 
TTACAGAAGA AAACATCACT 
ATGGAATGGT ACAAAGTCTT 
ATTTGCCATA GTGCTTTTGA 
AAAATTTGTT TTGTGACAGG 
TATCTACAGG ATGATCCAGA 
GCATCTAGAA GAAAAATATG 
TGTCAAGCGG ATTCCATTGC 
TCAAATCTTT AGCGACCTTG 
ACTTGGTGTT GATTCTGGGG 
GCTGCGTTGC TTTATAATAT 
AATTTTGATG AGTCGATTCG 
TCTGACATCA ACGGATGAAT 
GTACTGAACA TCGGAGCTAT 
TTTGACAAGA GAAGAGTTGG 
TGTACTCTTT CACCCTGTTA 
CGCAGGCCTT ATTAGATGCT 
ATTCGGATAC ACATGCCGAT 
AAACAAGACT CTGATTCTTA 
GTCAAGCATT CACAAGGTTT 
AGTGCCCTCA TTACAGGTTC 
AGGACCGAGT GTGGTACATG 
GTTTGGGGCA ATTACGTGAT 
CTTTACAAGG TTATCGAGCT 
ACCATGAAAG AGTTTTATGA 
AGGAGCGGGT ACCTTTTCAG 
GATATGAACT CATTGGATAT 
TATTTGGTCC CTTGCAAGAT 
GATGCTGTCT TCGTCACTAT 
GCCAAAGATC ATTATGATGC 
TATTTTTTCC CCAGATAGTA 
AGGAGCCGAT TCCTATGTCT 
TTGTGGAACA TCATACCACG 
ATGGCTTGTG CCGTATCGGA 
GTGATTCAAT GTATCGAGAT 
AAATCGTTGA CGGAGTCAGG 
AGTTACGAAA GCCTTGTTAA 
CCCCATATAA 

ACTGCCATCT ACCCAAAGTT 
TAAAGAAGGG 

AAGTCATGAG GTAAATCTCC 
CTATTGAAGA 

ACTATTCGAA AGAAATCTAG 
TCTTTATTAG 

AGACTTTATG GGAGCGATTT 
TGCAGAAATT 

TGGTAGAAGT TGCCGTTGAT 
ATTTGTTGAT 

ACAGGAGAGT CAGATTCTGA 
TATCTTGATT 

GACACCTTTT GATGAGGAAT 
GATTCCATCT 

GTCAAGCTAA GAAAGTTATT 
CGGTGAAGAT 

TGTACAACCG AGTATCCAAC 
AAAAAAGAAT 

TTCAGAAGTA CCCATCGCTG 
TCTGGACAAT 

ATATCTTAGC AGCCTTGGTA 
AAAAAGAGCC 

TCTATTGTTG CCAAAAAAGC 
GTCAAAAGAC 

GGGGCAGGTG AGTGAGCAGG 
AAATCAAATG 

CTCTCGTGCC GAATATGGGA 
AATGGAGCTG 

GGATGACGGT CAAAGACATC 
ATTTGACGGA 

ACAGAGCAAC TCACGGTTCT 
GATCGCTATG 

TCCTATTTGC CATATTCATG 
CCATGCCATT 

TTAGAAATCG TGTCATTCAA 
GGGTGTTGAA 

CGATGGAACT TGGAATTGAT 
CCTTGGAGGA 

CTAAAAGAAG ATGGTAGCCA 
AAGATAATGG 

CATCTTTACT TCGCTTCCAA 
AATAGGGAAT 

CGACCTTAAA TATTGGAAAT 
TTGGAACTTC 

GTGATAGATT TTACCAATCC 
ATCAAGGAAT 

TAGATAGGGG AGAAAGTTTG 
ATGGTGTCCT 

TTTGAAGATA AACCGATCAG 
GTCCTAACCT 

AGGTGACAAT GTCAAGCGCA 
TTTGTTCAAC 
TCAAGGGACG AGGGGTTTTC 
ATGACAATTG 

GTGGAGGCCC ATTGTAACAT 
GAAAGCACTT 

TGCACCTTAT ACAACATTGG 
GACCTATGTT 
GGTGTACCTG CTAGAAAGAT TAAATAGGTG AATTGATGGA ACCAATTTGT CTGATTCCTG 
CTCGGTCAGG ATCAAAAGGT TTACCAAATA AAAACATGTT ATTTTTAGAT 
GGTGTACCGA TGATTTTCCA TACCATTCGA GCTGCGATTG AGTCTGGATG TTTTAAGAAA 
GAAAATATAT ATGTCAGTAC TGATTCAGAG GTTTACAAGG AAATTTGTGA 
AACAACTGGG GTTCAAGTCC TCATGCGTCC AGCTGACTTG GCGACAGATT TTACAACCTC 
TTTTCAACTG AACGAACATT TTTTACAAGA TTTTTCTGAT GACCAAGTAT 
TTGTTCTCCT GCAAGTTACG TCCCCATTAA GATCGGGAAA ACATGTCAAG GAGGCGATGG 
AGTTATATGG GAAAGGTCAA GCTGACCACG TTGTTAGCTT TACCAAAGTC 
GATAAGTCTC CAACATTGTT TTCAACTTTA GACGAAAACG GATTCGCTAA GGATATTGCA 
GGATTAGGTG GCAGTTATCG TCGTCAAGAT GAGAAAACAC TCTACTATCC 
TAATGGAGCG ATTTATATTT CTTCTAAGCA GGCTTATTTA GCGGATAAAA CTTATTTTTC 
TGAAAAAACA GCGGCCTATG TGATGACGAA GGAAGATTCG ATTGATGTAG 
ATGATCACTT TGATTTTACT GGTGTTATTG GTCGAATTTA CTTTGATTAC CAGCGTCGTG 
AGCAACAAAA CAAACCATTT TATAAAAGAG AGTTAAAGCG TTTATGTGAG 
CAACGAGTCC ATGATAGTCT TGTGATTGGC GATAGTCGTC TGTTAGCCTT GTTACTGGAT 
GGTTTCGATA ATATCAGCAT CGGTGGGATG ACAGCTTCGA CAGCACTTGA 
AAACCAAGGT CTCTTTTTGG CTACTCCGAT AAAGAAAGTT TTGCTTTCTC TTGGTGTGAA 
TGATTTGATT ACTGACTATC CCTTGCATAT GATTGAGGAT ACTATTCGCC 
AGCTGATGGA AAGTCTTGTT TCCAAAGCAG AGCAGGTTTT TGTGACGACG ATTGCCTACA 
CGCTGTTTCG TGATAGCGTT TCCAATGAAG AAATTGTGCA GCTGAATGAC 
GTTATTGTTC AGTCAGCAAG TGAACTGGGT ATTTCAGTGA TTGATCTAAA TGAAGTTGTT 
GAAAAAGAGG CGATGCTTGA CTATCAGTAT ACCAATGATG GATTGCATTT 
CAATCAGATT GGACAAGAGC GTGTGAATCA GCTGATTTTG ACAAGTTTGA CAAGATAATT 
TGGTGATAGA AGCTATTTCA GTGGCTAGAC TATGTTGGTA TGTGTTTTAG 
AGCCCAGGAA TAACATCTGT AGAGGATGCT AGCCTTGAGA ATTGACAACC ATTTAGTTGT 
TTTAATTATA TAAGGGGACC TCTAAAAACT CCCTAAATTT CCCAAAAATG 
AGATAATAGA ATAAAAAGTA ATGAGGAGAG CTGTCATGCA TTTATTCACA GACGATGAAA 
AAATCTTGTC AAAACTATCA GAGAAAGGCA ATCCCTTAGA ACGTTTGGAT 
GCCGTTATGG ATTGGAATAT CTTTCTTCCA TTGTTGTCAG AGTTATTCAG TCGTAAAGAT 
AAAGTCATCA GTCGTGGCGG TCGTCCTCAC CTAGACTATC TCATGATGTT 
CAAAGCGCTC TTGCTTCAAC GTCTTCATAA CCTATCTGAC GATGCCATGG AATATCAACT 
GCTGGATCGT ATATCTTTTC GTCGTTTTGT TGGTTGTCAT GAAGACACTG 
TTCCCGATGC GAAAACTATC TGGCTCTATC GTGAGAAATT AACCAAGTCA GGTCGTGAAA 
AGGAGTTGTT CGATTTGTTC TATGCCCATC TCACAGATGA AGGGGTGATT 
GCCCATTCAG GTCAGATTGT GGATGCTACC TTTGTCGAAT GCCCTAAACA ACGCAATTCA 
CGTGAGGACA ATCAGAAAAT CAAAACTTAT CGAAAATTAT GAGGTCACAA 
CAGCTAGTGT ACACGACTCC AATGTCCTAG CTCCTCTTTG TGATGCCAAT GAAGCGGTTT 
TTGATGACAG TGCTTATGTT GGAAAATCAG TACCAGAAGG TTGTCGCCAC 
CACACGATTC GTCGTGCTTT TAGAAATAAA CCGTTGACTG AGACTGATAA GGTCATTAAT 
CGACATATTA CCAAAGTCCG TTGTCGCGTT GAGCATGGTT TTGGCTTCAT 
TGAAACTAAC ATGAAAGGTA ACATCTGTCG AGCAATTGGG AAGGCACGAG CTGAAACCAA 
TGTGACCTTA ACCAACCTGC TCTACAATAT CTGTCGTTTT GAGCAAATCA 
AACGACTGGG ATTACCATCC GTGGGCTTAG TGCGCCCAAA AAATAGGAAA ATAAGCAAAA 
AGAGGCTGGG CAAAAACTAG TTTCTCACAA TAAAAAAACG GCTCTTTGTC 
AACTGTAGTG GGTAGACGAA AAGCTAACAC CTAGAGAGGA CGAAATTCGT TCTCTCATTT 
TTGATGTTTA AAGCGTAACC GCCTAATAAC AAGGTATCTA TCCAATCACA 
CATTCCTCCA TTATATAGTT AAATGAAACA AAAACAGTAC ATCTATGATA TAATGTATTT 
ATGGCATATT CATTAGATTT TCGTAAAAAA GTTCTCGCAT ACTGTGAGAA 
AACCGGCAGT ATTACTGAAG CATCAGCTAT TTTCCAAGTT TCACGTAACA CTATCTATCA 
ATGGCTAAAA TTAAAAGAGA AAACCGGCGA GCTTCATCAC CAAGTTAAAG 
GAACCAAGCC AAGAAAAGTG GATAGAGATA AATTAAAGAA TTATCTTGAA ACTCATCCAG 
ATGCTTATTT GACTGAAATA GCTTCTGAAT TTGACTGTCA TCCAACAGCT 
ATTCATTACC CCCTCAAAGC TATGGGATAT ACTCGAAAAA AAAGAGCTGT ACCTACTATG 
AACAAGACCC TGAAAAAGTA GAACTGTTCC TTAAAGAATT GAATAACTTA 
AGCCACTTGA CTCCTGTTTA TATTGACGAG ACAGGGTTTG AGACATATTT TCATCGAAAA 
TATGGTCGCT CTTTGAAAGG TCAGTTGATA AAAGGTAAGG TCTCTGGAAG 
AAGATACCAG CGGATATCTT TAGTAGCAGG TCTCATAAAT GGTGCGCTTA TAGCCCCGAT 
GACATACAAA GATACTATGA CGAGTGGCTT TTTCGAAGCT T 

Fig. 3 cont. 
SLDIDHMMEVMEASKSAAGSACPSPQAYQAAFEGAENIIWTITGGLSGSFNAARVARDM 
YIEEHPNVNIHLIDSLSASGEMDLLVHQINRLISAGLDFPQWEAITHYREHSKLLFVLA 
KVDNLVKNGRLSKLVGTVVGLLNIRMVGEASAEGKLELLQKARGHKKSVTAAFEEMKKAG 
YDGGRIVMAHRNNAKFFQQFSELVKASFPTAVIDEVATSGLCSFYAEEGGLLMGYEVKA 



Fig. 3 cont. 
MKKYQVIIQDILTGIEEHRFKRGEKLPSIRQLREQYHCSKDTVQKAMLELKYQNKIYAVE 
KSGYYILEDRDFQDHTCRAQSYRLSRITYEDFRICLKESLIGRENYLFNYYHQQEGLAEL 
ISSVQSLLMDYHVYTKKDQLVITAGSQQALYILTQMETLAGKTEILIENPTYSRMIELIR 
HQGIPYQTIERNLDGIDLEELESIFQTGKIKFFYTIPRLHNPLGSTYDIATKTAIVKLAK 
QYDVYIIEDDYLADFDSSHSLPLHYLDTDNRVIYIKSFTPTLFPALRIGAISLPNQLRDI 
FIKHKSLIDYDTNLIMQKALSLYIDNGMFARNTQHLHHIYHAQWNKIKDCLEKYALNIPY 
RIPKGSVTFQLSKGILSPSIQHMFGKCYYFSGQKADFLQIFFEQDFADKLEQFVRYLNE 



Fig. 3 cont. 
MKIIIPNAKEVNTNLENASFYLLSDRSKPVLDAISQFDVKKMAAFYKLNEAKAELEADRW 
YRIRTGQAKTYPAWQLYDGLMYRYMDRRGIDSKEENYLRDHVRVATALYGLIHPFEFISP 
HRLDFQGSLKIGNQSLKQYWRPYYDQEVGDDELILSLASSEFEQVFSPQIQKRLVKILFM 
EEKAGQLKVHSTISKKGRGRLLSWLAKNNIQELSDIQDFKVDGFEYCTSESTANQLTFXR 

SIKM 

Fig. 3 cont. ORF2X 
MKKRSGRSKSSKFKLVNFALLGLYSITLCLFLVTMYRYNILDFRYLNYIVTLLLVGVAVL 
AGLLMWRKKARIFTALLLVFSLVITSVGIYGMQEWKFSTRLNSNSTFSEYEMSILVPAN 
SDITDVRQLTSILAPAEYDQDNITALLDDISKMESTQLATSPGTSYLTAYQSMLNGESQA 
MVFNGVFTNILENEDPGFSSKVKKIYSFKVTQTVETATKQVSGDSFNIYISGIDAYGPIS 
TVSRSDVNIIMTVNRATHKILLTTTPRDSYVAFADGGQNQYDKLTHAGIYGVNASVHTLE 
NFYGIDISNYVRLNFISFLQLIDLVGGIDVYNDQEFTSLHGNYHFPVGQVHLNSDQALGF 
VRERYSLTGGDNDRGKNQEKVIAALIKKMSTPENLKNYQAILSGLEGSIQTDLSLETIMS 
LVNTQLESGTQFTVESQALTGTGRSDLSSYAMPGSQLYMMEINQDSLEQSKAAIQSVLVE 

K 

Fig. 3 cont. CPS2A 
MNNQEVNAIEIDVLFLLKTIWRKKFLILLTAVLTAGLAFVYSSFLVTPQYDSTTRIYWS 
QNVEAGAGLTNQELQAGTYLAKDYREIILSQDVLTQVATELNLKESLKEKISVSIPVDTR 
IVSISVRDADPNEAARIANSLRTFAVQKWEVTKVSDVTTLEEAVPAEEPTTPNTKRNIL 
LGLLAGGILATGLVLVMEVLDDRVKRPQDIEEVMGLTLLGIVPDSKKLK 



Fig. 3 cont. 



CPS2B 
MAMLEIARTKREGVNKTEEYFNAIRTNIQLSGADIKWGITSVKSNEGKSTTAASLAIAY 
ARSGYOTVLVDADIRNSVMPGFFKPITKITGLTDYLAGTTDLSQGLCDTDIPNLTVIESG 
KVSPNPTALLQSKNFENLLATLRRYYDYVIVDCPPLGLVIDAAIIAQKCDAMVAWEAGN 
VKCSSLKKVKEQLEQTGTPFLGVILNKYDIATEKYSEYGNYGKKA 



Fig. 3 cont. CPS2C 
MIDIHSHIIFGVDDGPKTIEESLSLISEAYRQGVRYIVATSHRRKGMFETPEKIIMINFL
QLKEAVAEVYPEIRLCYGAELYYSKDILSKLEKKKVPTLNGSCYILLEFSTDTPWKEIQE 
AVNEMTLLGLTPVLAHIERYDALAFQSERVEKLIDKGCYTQVNSNHVLKPALIGERAKEF 
KKRTRYFLEQDLVHCVASDMHNLYSRPPFMREAYQLVKKEYGEDRAKALFKKNPLLILKN 
QVQ 



Fig. 3 cont. 
MNIEIGYRQTKLALFDMIAVTISAILTSHIPNADLNRSGIFIIMMVHYFAFFISRMPVEF
EYRGNLIEFEKTFNYSIIFVIFLMAVSFMLENNFALSRRGAVYFTLINFVLVYLFNVIIK
QFKDSFLFSTTYQKKTILITTAELWENMQVLFESDILFQKNLVALVILGTEIDKINLPLP 
LYYSVEEAIGFSTREWDYVFINLPSEYFDLKQLVSDFELLGIDVGVDINSFGFTVLKNK 
KIQMLGDHSIVTFSTNFYKPSHIWMKRLLDILGAWGLIISGIVSILLIPIIRRDGGPAI 
FAQKRVGQNGRIFTFYKFRSMFVDAEVRKKELMAQNQMQGGMFKMDNDPRITPIGHFIRK
TSLDELPQFYNVLIGDMSLVGTRPPTVDEFEKYTPSQKRRLSFKPGITGLWQVSGRSDIT
DFNEWRLDLTYIDNWTIWSDIKILLKTVKWLLREGGQ



Fig. 3 cont. 



CPS2E 
MRTVYIIGSKGIPAKYGGFETFVEKLTEYQKDKSINYFVACTRENSAKSDITGEVFEHNG
ATCFNIDVPNIGSAKAILYDIMALKKSIEIAKDRNDTSPIFYILACRIGPFIYLFKKQIE 
SIGGQLFVNPDGHEWLREKWSYPVRQYWKFSESLMLKYADLLICDSKNIEKYIHEDYRKY 
APETSYIAYGTDLDKSRLSPTDSWREWYKEKEISENDYYLWGRFVPENNYEVMIREFM 
KSYSRKDFVLITNVEHNSFYEKLKKETGFDKDKRIKFVGTVYNQELLKYIRENAFAYFHG 
HEVGGTNPSLLEALSSTKLNLLLDVGFNREVGEEGAKYWNKDNLHRVIDSCEQLSQEQIN 
DMDSLSTKQVKERFSWDFIVDEYEKLFKG 

Fig. 3 cont. CPS2F
MKKILYLHAGAELYGADKVLLELIKGLDKNEFEAHVILPNDGVLVPALREVGAQVEVINY
PILRRKYFNPKGIFDYFISYHHYSKQIAQYAIENKVDIIHNNTTAVLEGIYLKRKLKLPL
LWHVHEIIVKPKFISDSINFLMGRFADKIVTVSQAVANHIKQSPHIKDDQISVIYNGVDN 
KVFYQSDARSVRERFDIDEEALVIGMVGRVNAWKGQGDFLEAVAPILEQNPKAIAFIAGS 
AFEGEEWRWELEKKISQLKVSSQVXRMDYYANTTELYNMFDIFVLPSTNPDPLPTWLK 
AMACGKPWGYRHGGVCEMVKEGVNGFLVTPNSPLNLSKVILQLSENINLRKKIGNNSIE 
RQKEHFSLKSYVKNFSKVYTSLKVY 



Fig. 3 cont. 
MKIISFTMVNNESEIIESFIRYNYNFIDEMVIIDNGCTDNTMQIIFNLIKEGYKISVYDE
SLEAYNQYRLDNKYLTKIIAEKNPDLIIPLDADEFLTADSNPRKLLEQLDLEKIHYVNWQ 
WFVMTKKDDINDSFIPRRMQYCFEKPVWHHSDGKPVTKCIISAKYYKKMNLKLSMGHHTV 
FGNPNVRIEHHNDLKFAHYRAISQEQLIYKTICYTIRDIATMENNIETAQRTNQMALIES
GVDMWETAREASYSGYDCNVIHAPIDLSFCKENIVIKYNELSRETVAERVMKTGREMAVR 
AYNVERKQKEKKFLKPIIFVLDGLKGDEYIHPNPSNHLTILTEMYNVRGLLTDNHQIKFL 
KVNYRLIITPDFAKFLPHEFIWPDTXDIEQVKSQYVGTGVDLSKIISLKEYRKEIGFIG 
NLYALLGFVPNMLNRIYLYIQRNGIANTIIKIKSRL.



Fig. 3 cont. 



CPS2H 




MQADRRKTFGKMRIRINNLFFVAIAFMGIIISNSQWLAIGKASVIQYLSYLVLILCIVN 
DLLKNNKHIWYKLGYLFLIIFLFTIGICQQILPITTKIYLSISMMIISVLATLPISLIK 
DIDDFRRISNHLLFALFITSILGIKMGATMFTGAVEGIGFSQGFNGGLTHKNFFGITILM
GFVLTYLAYKYGSYKRTDRFILGLELFLILISNTRSVYLILLLFLFLVNLDKIKIEQRQW 
STLKYISMLFCAIFLYYFFGFLITHSDSYAHRVNGLINFFEYYRNDWFHLMFGAADLAYG
DLTLDYAIRVRRVLGWNGTLEMPLLSIMLKNGFIGLVGYGIVLYKLYRNVRILKTDNIKT 
IGKSVFIIWLSATVENYIVNLSFVFMPICFCLLNSISTMESTINKQLQT 



Fig. 3 cont. 
MEKVSIIVPIFNTEKYLRECLDSIISQSYTNLEILLIDDGSSDSSTDICLEYAEQDGRIK
LFRLPNGGVSNARNYGIKNSTANYIMFVDSDDIVDGNIVESLYTCLKENDSDLSGGLLAT
FDGNYQESELQKCQIDLEEIKEVRDLGNENFPNHYMSGirNSPCCKLYKNIYINQGFDTE 
QWLGEDLLFNLNYLKNIKKVRYVNRNLYFARRSLQSTTNTFKYDVFIQLENLEEKTFDLF 
VKIFGGQYEFSVFKETLQWHIIYYSLLMFKNGDESLPKKLHIFKYLYNRHSLDTLSIKRT 
SSVFKRICKLIVANNLFKIFLNTLIREEKNND 
MINISIIVPI YNVEQYLSKC
YFKKENGGLS DARNYGISRA 
ALVAVAGYDR VDASGHFLTA 
EDFRFEKGKI HEDEYFTYRL 
DHRFHCLLEF QNERMDFYES 
FRIVYKQLKQ NKRLALLMNA 
SSTR 
INSIVNQTYK HIEILLVNDG
KGDYLAFIDS DDFIHSEFIQ 
EPLPTNQAVL SGRNVCKKLL 
LYELEKVAIV KECLYYYVDR 
RGDKELLLEC YRSFLAFAVL 
YYLVGCLHLN FSVFLKTGKD 



STDNSEEICL AYAKKDSRIR 
RLHEAIEREN 

EADGHRFWA WNKLYKKELF 
ENSIITSSMT 

FLGKYNHWLS KQQKKLLQTL 
KIQERLRRSE 



Fig. 3 cont. 
MSKKSIWSG LVYTIGTILV QGLAFITLPI YTRVISQEVY GQFSLYNSWV GLVGLFIGLQ
LGGAFGPGWV HFREKFDDFV STLMVSSIAF FLPIFGLSFL LSQPLSLLFG 
LPDWWPLIF LQSLMIWQG FFTTYLVQRQ QSMWTLPLSV LSAVINTALS LFLTFPMEND 
FIARVMANPA TTGVLACVSX WFSQKKNGLH FRKDYLRYGL SISIPLIFHG 
LGHNVLNQFD RIMLGKMLTL SDVALYSFGY TLASILQIVF SSLNTVWCPW YFEKKRGADK 
DLLSYVRYYL AIGLFVTFGF LTIYPELAML LGGSEYRFSM GFIPMIIVGV 
FFVFLYSFPA NIQFYSGNTK FLPIGTFIAG VLNISVHFVL IPTKNLWCCF ATTASYLLLL 
VLHYFVAKKK YAYDEVAIST FVKVIALWV YTGLMTVFVG SIWIRWSLGI 
AVLWYAYIF RKELTVALNT FREKRSK 

Fig. 3 cont. CPS2O
MVYIIAEIGC NHNGDVHLAR KMVEVAVDCG VDAVKFQTFK ADLLISKYAP KAEYQKITTG
ESDSQLEMTR RLELSFEEYL DLRDYCLEKG VDVFSTPFDE ESLDFLISTD 
MPVYKIPSGE ITNLPYLEKI GRQAKKVILS TGMAVMDEIH QAVKILQENG TTDISILHCT 
TEYPTPYPAL NLNVLHTLKK EFPNLTIGYS DHSVGSEVPI AAAAMGAELI 
EKHFTLDNEM EGPDHKASAT PDILAALVKG VRIVEQSLGK FEKEPEEVEV RNKIVARKSI 
VAKKAIAKGE VFTEENITVK RPGNGISPME WYKVLGQVSE QDFEEDQNIC 
HSAFENQM 
MKKICFVTGS RAEYGIMRRL
KRIPLHLTDT SKQTIVKSLA 
ANAALLYNIP ICHIHGGEKT 




LSYLQDDPEM ELDLWTAMH 
TLTEQLTVLF EEVQYDLVLI 
MGNFDESIRH AITKMSHLHL 



LEEKYGMTVK DIEADKRRIV 
LGDRYEMLPV 

TSTDEFRNRV IQLGENPTMY 



Fig. 3 cont. 
MELGIDFAED YYWLFHPVT

LMHEFVKQDS DSYIFTSLPT 

TLNIGNRQFG RLSGPSWHV 

LSVQASTMKE FYDR 
LEDNTAEEQT QALLDALKED
RYYHSLVKHS QGLIGNSSSG 
GTSKEAIVGG LGQLRDVIDF 



GSQCLIIGSN SDTHADKIME 
LIEVPSLQVP 

TNPFEQPDSA LQGYRAIKEF 
MKKVAFLGAG TFSDGVLPWL
DAVFVTIGDN VKRKEIFDLL 
IGFSSFVGAD SYVYDNCIIN 
VIQCIEIAPY TTLGAGTWL 
DRTRYELIGY FEDKPISDYR
AKDHYDALFN IISEQANIFS 
TGAIVEHHTT VEAHCNITPG 
KSLTESGTYV GVPARKIK 



GYPVFGPLQD VLTYLDDGKV 
PDSIKGRGVF 

VTINGLCRIG ESTYIGSGST 



Fig. 3 cont. 
MEPICLIPAR SGSKGLPNKN MLFLDGVPMI FHTIRAAIES GCFKKENIYV STDSEVYKEI
CETTGVQVLM RPADLATDFT TSFQLNEHFL QDFSDDQVFV LLQVTSPLRS 
GKHVKEAMEL YGKGQADHW SFTKVDKSPT LFSTLDENGF AKDIAGLGGS YRRQDEKTLY 
YPNGAIYISS KQAYLADKTY FSEKTAAYVM TKEDSIDVDD HFDFTGVIGR 
IYFDYQRREQ QNKPFYKREL KRLCEQRVHD SLVIGDSRLL ALLLDGFDNI SIGGMTASTA 
LENQGLFLAT PIKKVLLSLG VNDLITDYPL HMIEDTIRQL MESLVSKAEQ 
VFVTTIAYTL FRDSVSNEEI VQLNDVIVQS ASELGISVID LNEWEKEAM LDYQYTNDGL 
HFNQIGQERV NQLILTSLTR 



Fig. 3 cont. 
ATCGCCAAAC GAAATTGGCA TTATTTGATA TGATAGCAGT TGCAATTTCT GCAATCTTAA CAAGTCATAT
ACCAAATGCT GATTTAAATC GTTCTGGAAT TTTTATCATA 

ATGATGGTTC ATTATTTTGC ATTTTTTATA TCTCGTATGC CAGTTGAATT TGAGTATAGA GGTAATCTGA 
TAGAGTTTGA AAAAACATTT AACTATAGTA TAATATTTGC 

AATTTTTCTT ACGGCAGTAT CATTTTTGTT GGAGAATAAT TTCGCACTTT CAAGACGTGG TGCCGTGTAT 
TTCACATTAA TAAACTTCGT TTTGGTATAC CTATTTAACG 

TAATTATTAA GCAGTTTAAG GATAGCTTTC TATTTTCGAC AATCTATCAA AAAAAGACGA TTCTAATTAC
AACGGCTGAA CGATGGGAAA ATATGCAAGT TTTATTTGAA 

TCACATAAAC AAATTCAAAA AAATCTTGTT GCATTGGTAG TTTTAGGTAC AGAAATAGAT AAAATTAATT 
TATCATTACC GCTCTATTAT TCTGTGGAAG AAGCTATAGA 

GTTTTCAACA AGGGAAGTGG TCGACCACGT CTTTATAAAT CTACCAAGTG AGTTTTTAGA CGTAAAGCAA 
TTCGTTTCAG ATTTTGAGTT GTTAGGTATT GATGTAAGCG 

TTGATATTAA TTCATTCGGT TTTACTGCGT TGAAAAACAA AAAAATCCAA CTGCTAGGTG ACCATAGCAT 
TGTAACTTTT TCCACAAATT TTTATAAGCC TAGTCATATC 

ATGATGAAAC GACTTTTGGA TATACTCGGA GCGGTAGTCG GGTTAATTAT TTGTGGTATA GTTTCTATTT 
TGTTAGTTCC AATTATTCGT AGAGATGGTG GACCGGCTAT 

TTTTGCTCAG AAACGAGTTG GACAGAATGG ACGCATATTT ACATTCTACA AGTTTCGATC GATGTATGTT 
GATGCTGAGG AGCGCAAAAA AGACTTGCTC AGCCAAAACC 

AGATGCAAGG GTGGGTATGT TTTAAAATGG GAAAAACGAT CCTAGAATTA CTCCAATTGG ACATTTCATA 
CGCAAAAACA AGTTTAGACG AGTTACCACA GTTTTATAAT 

GTTTTAATTG GCGATATGAG TCTAGTTGGT ACACGTCCAC CTACAGTTGA TGAATTTGAA AAATATACTC 
CTGGTCAAAA GAGACGATTG AGTTTTAAAC CAGGGATTAC 

AGGTCTCTGG CAGGTTAGTG GTCGTAGTAA TATCACAGAC TTCGACGACG TAGTTCGGTT GGACTTAGCA 
TACATTGATA ATTGGACTAT CTGGTCAGAT ATTAAAATTT 

TATTAAAGAC AGTGAAAGTT GTATTGTTGA GAGAGGGAAG TAAGTAAAAG TATATGAAAG TTTGTTTGGT 
CGGTTCTTCA GGGGGACATT TGACTCACTT GTATTTGTTA 

AAACCGTTTT GGAAGGAAGA AGAACGTTTT TGGGTAACAT TTGATAAAGA GGATGCAAGA AGTCTTTTGA 
AGAATGAAAA AATGTATCCA TGTTACTTTC CAACAAATCG 

CAATCTCATT AATTTAGTGA AAAATACTTT CTTAGCTTTC AAAATTTTAC GTGATGAGAA ACCAGATGTT 
ATTATTTCAT CTGGTGCGGC CGTTGCTGTC CCCTTCTTTT 

ACATCGGAAA ACTATTTGGA GCAAAGACGA TTTATATTGA AGTATTTGAT CGAGTTAATA AATCTACATT 
AACTGGAAAA CTAGTTTATC CCGTAACAGA TATTTTTATT 

GTTCAGTGGG AAGAAATGAA GAAGGTATAT CCTAAATCTA TTAACTTGGG GAGTATTTTT TAATGATTTT 
TGTAACAGTA GGAACTCATG AACAACAGTT TAATCGATTG 

ATAAAAGAGA TTGATTTATT GAAAAAAAAT GGAAGTATM CCGACGAAAT ATTTATTCAA ACAGGATATT
CTGACTATAT TCCAGAATAT TGCAAGTATA AAAAATTTCT 

CAGTTACAAA GAAATGGAAC AATATATTAA CAAATCAGAA GTAGTTATTT GCCACGGAGG CCCCGCTACT 
TTTATGAATT CATTATCCAA AGGAAAAAAA CAATTATTGT 

TTCCTAGACA AAAAAAGTAT GGTGAACATG TAAATGATCA TCAAGTAGAG TTTGTAAGAA GAATTTTACA 
AGATAATAAT ATTTTATTTA TAGAAAATAT AGATGATTTG 

TTTGAAAAAA TTATTGAAGT TTCTAAGCAA ACTAACTTTA CATCAAATAA TAATTTTTTT TGTGAAAGAT 
TAAAACAAAT AGTTGAAAAA TTTAATGAGG ATCAAGAAAA 

TGAATAATAA AAAAGATGCA TATTTGATAA TGGCTTATCA TAATTTTTCT CAGATTTTAC TGGAGAGGGA 
TACAGATATT ATCATCTTCT CTCAGGAGAA TGCACACCAT

TAGTTCCTTC AGAATACCTG TATAATTATT TTAAATATTC TCAGGATTTA TATGTTGAAT TTACAAAAGA 
TGAGCAAAAA TATAAAGAAA ATAGGATATA TGAACGAGTT 

AAATGTTACA GATTATTTCC TAATATATCA GAAAAAACTA TTGATAATGT ACTGTTTAGA ATTTTATTAA 
GAATGTATCG AGCTTTTGAA TACTATTTAC AAAGATTGTT 

GTTTATTGAT AGAATAAAAA ACATGGTCTA AGAATAAGAT TTGGTTCTAA TTGGGTTTCG CTTCCACATG 
ATTTTGTGGC AATTCTTTTA TCAAATGAAA ACGAAACAGC 

TTATTTATTT AAGTAATCTA AATGTCCAGA TGAACTATTT ATACAGACAA TTATAGAAAA ATATGAATTT 
TCAAATAGAT TATCTAAATA TGGAAATTTA AGATATATAA 

AGTGGAAAAA ATCAACATCT TCTCCTATTG TCTTTACAGA TGATTCTATT GATGAATTGC TAAATGCAAG 
AAATTTAGGT TTTTTATTTG CTAGAAAGTT AAAAATAGAA 

AATAAATCTA AATTTAAAGA AATTATTACT AAAAAATAAA ATAGTTGATT TTGTGAGAGT AATGTATGTT 
TAAATTATTT AAATATGACC CGGAATATTT TATTTTTAAG 

TACTTCTGGT TGATTATTTT TATTCCAGAG CAAAAGTATG TATTTTTATT AATTTTTATG AATTTAATTT 
TATTTCATAT AAAATTTTTG AAAACTAAGC TAATATTAAA 

AAATGAAATT TTATTGTTTT TATTATGGTC TATATTATGT TTTGTTTCAG TAGTCACAAG TATGTTTGTT 
GAAATAAATT TTGAAAGATT ATTTGCAGAT TTTACTGCTC 

CCATAATTTG GATTATTGCA ATAATGTATT ATAATTTGTA TTCATTTATA AATATTGATT ATAAAAAATT 
AAAAAATAGT ATCTTTTTTA GTTTTTTAGT TTTATTAGGT 

ATATCTGCAT TGTATATTAT TCAAAATGGG AAAGATATTG TATTTTTAGA CAGACACCTT ATAGGACTAG 
ACTATCTTAT AACAGGCGTC AAAACAAGGT TGGTTGGCTT 

TATGAACTAT CCTACGTTAA ATACCACTAC AATTATAGTT TCAATTCCGT TAATCTTTGC ACTTATAAAA 
AATAAAATGC AACAATTTTT TTTCTTGTGT CTTGCTTTTA 
TACCGATCTA TTTAAGTGGA TCGAGAATTG GTAGTTTATC GCTAGCAATA TTAATTATAT GCTTGTTATG
GAGATATATA GGTGGAAAAT TTGCTTGGAT AAAAAAGCTA 

ATAGTAATAT TTGTAATACT ACTTATTATT TTAAATACTG AATTGCTTTA CCATGAAATT TTGGCTGTTT 
ATAATTCTAG AGAATCAAGT AACGAAGCTA GATTTATTAT 

TTATCAAGGA AGTATTGATA AAGTATTAGA AAACAATATT TTATTTGGAT ATGGAATATC CGAATATTCA 
GTTACGGGAA CTTGGCTCGG AAGTCATTCA GGCTATATAT 

CATTTTTTTA TAAATCAGGA ATAGTTGGGT TGATTTTACT GATGTTTTCT TTTTTTTATG TTATAAAAAA 
AAGTTATGGA GTTAATGGGG AAACAGCACT ATTTTATTTT 

ACATCATTAG CCATATTTTT CATATATGAA ACAATAGATC CGATTATTAT TATATTAGTA CTATTCTTTT
CTTCAATAGG TATTTGGAAT AATATAAATT TTAAAAAGGA 

TATGGAGACA AAAAATGAAT GATTTAATTT CAGTTATTGT ACCAATTTAT AATGTCCAAG ATTATCTTGA 
TAAATGTATT AACAGTATTA TTAACCAAAC ATATACTAAT 

TTAGAGGTTA TTCTCGTAAA TGATGGAAGT ACTGATGATT CTGAGAAAAT TTGCTTAAAC TATATGAAGA 
ACGATGGAAG AATTAAATAT TACAAGAAAA TTAATGGCGG 

TCTAGCAGAT GCTCGAAATT TCGGACTAGA ACATGCAACA GGTAAATATA TTGCTTTTGT CGATTCTGAT 
GACTATATAG AAGTTGCAAT GTTCGAGAGA ATGCATGATA 

ATATAACTGA GTATAATGCC GATATAGCAG AGATAGATTT TTGTTTAGTA GACGAAAACG GGTATACAAA 
GAAAAAAAGA AATAGTAATT TTCATGTCTT AACGAGAGAA 

GAGACTGTAA AAGAATTTTT GTCAGGATCT AATATAGAAA ATAATGTTTG GTGCAAGCTT TATTCACGAG 
ATATTATAAA AGATATAAAA TTCCAAATTA ATAATAGAAG 

TATTGGTGAG GATTTGCTTT TTAATTTGGA GGTCTTGAAC AATGTAACAC GTGTAGTAGT TGATACTAGA
GAATATTATT ATAATTATGT CATTCGTAAC AGTTCGCTTA 

TTAATCAGAA ATTCTCTATA AATAATATTG ATTTAGTCAC AAGATTGGAG AATTACCCCT TTAAGTTAAA 
AAGAGAGTTT AGTCATTATT TTGATGCAAA AGTTATTAAA

GAGAAGGTTA AATGTTTAAA CAAAATGTAT TCAACAGATT GTTTGGATAA TGAGTTCTTG CCAATATTAG 
AGTCTTATCG AAAAGAAATA CGTAGATATC CATTTATTAA 

AGCGAAAAGA TATTTATCAA GAAAGCATTT AGTTACGTTG TATTTGATGA AATTTTCGCC TAAACTATAT 
GTAATGTTAT ATAAGAAATT TCAAAAGCAG TAGAGGTAAA 

AATGGATAAA ATTAGTGTTA TTGTTCCAGT TTATAATGTA GATAAATATT TAAGTAGTTG TATAGAAAGC 
ATTATTAATC AAAATTATAA AAATATAGAA ATATTATTGA 

TAGATGATGG CTCTGTAGAT GATTCTGCTA AAATATGCAA GGAATATGCA GAAAAAGATA AAAGAGTAAA 
AATTTTTTTC ACTAATCATA GTGGAGTATC AAATGCTAGA 

AATCATGGAA TAAAGCGGAG TACAGCTGAA TATATTATGT TTGTTGACTC TGATGATGTT GTTGATAGTA 
GATTAGTAGA AAAATTATAT TTTAATATTA TAAAAAGTAG 

AAGTGATTTA TCTGGTTGTT TGTACGCTAC TTTTTCAGAA AATATAAATA ATTTTGAAGT GAATAATCCA 
AATATTGATT TTGAAGCAAT TAATACCGTG CAGGACATGG 

GAGAAAAAAA TTTTATGAAT TTGTATATAA ATAATATTTT TTCTACTCCT GTTTGTAAAC TATATAAGAA 
AAGATACATA ACAGATCTTT TTCAAGAGAA TCAATGGTTA 

GGAGAAGATT TACTTTTTAA TCTGCATTAT TTAAAGAATA TAGATAGAGT TAGTTATTTG ACTGAACATC 
TTTATTTTTA TAGGAGAGGT ATACTAAGTA CAGTAAATTC 

TTTTAAAGAA GGTGTGTTTT TGCAATTGGA AAATTTGCAA AAACAAGTGA TAGTATTGTT TAAGCAAATA 
TATGGTGAGG ATTTTGACGT ATCAATTGTT AAAGATACTA

TACGTTGGCA AGTATTTTAT TATAGCTTAC TAATGTTTAA ATACGGAAAA CAGTCTATTT TTGACAAATT 
TTTAATTTTT AGAAATCTTT ATAAAAAATA TTATTTTAAC 

TTGTTAAAAG TATCTAACAA AAATTCTTTG TCTAAAAATT TTTGTATAAG AATTGTTTCG AACAAAGTTT 
TTAAAAAAAT ATTATGGTTA TAATAGGAAG ATATCATGGA 

TACTATTAGT AAAATTTCTA TAATTGTACC TATATATAAT GTAGAAAAAT ATTTATCTAA ATGTATAGAT
AGCATTGTAA ATCAGACCTA CAAACATATA GAGATTCTTC 

TGGTGAATGA CGGTAGTACG GATAATTCGG AAGAAATTTG TTTAGCATAT GCGAAGAAAG ATAGTCGCAT 
TCGTTATTTT AAAAAAGAGA ACGGCGGGCT ATCAGATGCC 

CGTAATTATG GCATAAGTCG CGCCAAGGGT GACTACTTAG CTTTTATAGA CTCAGATGAT TTTATTCATT 
CGGAGTTCAT CCAACGTTTA CACGAAGCAA TTGAGAGAGA 

GAATGCCCTT GTGGCAGTTG CTGGTTATGA TAGGGTAGAT GCTTCGGGGC ATTTCTTAAC AGCAGAGCCG 
CTTCCTACAA ATCAGGCTGT TCTGAGCGGC AGGAATGTTT 

GTAAAAAGCT GCTAGAGGCG GATGGTCATC GCTTTGTGGT GGCCTGTAAT AAACTCTATA AAAAAGAACT 
ATTTGAAGAT TTTCGATTTG AAAAGGGTAA GATTCATGAA 

GATGAATACT TCACTTATCG CTTGCTCTAT GAGTTAGAAA AAGTTGCAAT AGTTAAGGAG TGCTTGTACT 
ATTATGTTGA CCGAGAAAAT AGTATCACAA CTTCTAGCAT 

GACTGACCAT CGCTTCCATT GCCTACTGGA ATTTCAAAAT GAACGAATGG ACTTCTATGA AAGTAGAGGA 
GATAAAGAGC TCTTACTAGA GTGTTATCGT TCATTTTTAG

CCTTTGCTGT TTTGTTTTTA GGCAAATATA ATCATTGGTT GAGCAAACAG CAAAAGAAGC TT 
RQTKLALFDM IAVAISAILT SHIPNADLNR
EFEKTFNYSI IFAIFLTAVS FLLENNFALS
IIKQFKDSFL FSTIYQKKTI LITTAERWEN 
SLPLYYSVEE AIEFSTREW DHVFINLPSE 
DINSFGFTAL KNKKIQLLGD HSIVTFSTNF 
LVPIIRRDGG PAIFAQKRVG QNGRIFTFYK
MQGWVCFKMG KTILELLQLD ISYAKTSLDE 
GQKRRLSFKP GITGLWQVSG RSNITDFDDV 
LKTVKWLLR EGSK 



SGIFIIMMVH YFAFFISRMP VEFEYRGNLI 
RRGAVYFTLI NFVLVYLFNV 
MQVLFESHKQ IQKNLVALW LGTEIDKINL 
FLDVKQFVSD FELLGIDVSV 
YKPSHIMMKR LLDILGAWG LIICGIVSIL 
FRSMYVDAEE RKKDLLSQNQ 
LPQFYNVLIG DMSLVGTRPP TVDEFEKYTP 
VRLDLAYIDN WTIWSDIKIL 
MKVCLVGSSG GHLTHLYLLK PFWKEEERFW VTFDKEDARS LLKNEKMYPC YFPTNRNLIN
LVKNTFLAFK ILRDEKPDVI ISSGAAVAVP FFYIGKLFGA KTIYIEVFDR 
VNKSTLTGKL VYPVTDIFIV QWEEMKKVYP KSINLGSIF 
MIFVTVGTHE QQFNRLIKEI DLLKKNGSIT DEIFIQTGYS DYIPEYCKYK KFLSYKEMEQ
YINKSEWIC HGGPATFMNS LSKGKKQLLF PRQKKYGEHV NDHQVEFVRR 
ILQDNNILFI ENIDDLFEKI IEVSKQTNFT SNNNFFCERL KQIVEKFNED QENE 
MFKLFKYDPE YFIFKYFWLI IFIPEQKYVF LLIFMNLILF HIKFLKTKLI LKNEILLFLL

WSILCFVSW TSMFVEINFE RLFADFTAPI IWIIAIMYYN LYSFINIDYK 

KLKNSIFFSF LVLLGISALY IIQNGKDIVF LDRHLIGLDY LITGVKTRLV GFMNYPTLNT 

TTIIVSIPLI FALIKNKMQQ FFFLCLAFIP IYLSGSRIGS LSPLAILIIC 

LLWRYIGGKF AWIKKLIVIF VILLIILNTE LLYHEILAVY NSRESSNEAR FIIYQGSIDK 

VLENNILFGY GISEYSVTGT WLGSHSGYIS FFYKSGIVGL ILLMFSFFYV

IKKSYGVNGE TALFYFTSLA IFFIYETIDP IIIILVLFFS SIGIWNNINF KKDMETKNE 
MNDLISVIVP IYNVQDYLDK
KYYKKINGGL ADARNFGLEH 
NADIAEIDFC LVDENGYTKK 
IKFQINNRSI GEDLLFNLEV 
SINNIDLVTR LENYPFKLKR 
EIRRYPFIKA KRYLSRKHLV 
CINSIINQTY TNLEVILVND
ATGKYIAFVD SDDYIEVAMF 
KRNSNFHVLT REETVKEFLS 
LNNVTRWVD TREYYYNYVI 
EFSHYFDAKV IKEKVKCLNK 
TLYLMKFSPK LYVMLYKKFQ 



GSTDDSEKIC LNYMKNDGRI 
ERMHDNITEY 

GSNIENNVWC KLYSRDIIKD 
RNSSLINQKF 

MYSTDCLDNE FLPILESYRK 
KQ 
MDKISVIVPV YNVDKYLSSC IESIINQNYK NIEILLIDDG SVDDSAKICK EYEKDKRVKI

FFTNHSGVSN ARNHGIKRST AEYIMFVDSD DWDSRLVEK LYFNIIKSRS 

DLSGCLYATF SENINNFEVN NPNIDFEAIN TVQDMGEKNF MNLXXNNIFS TPVCXLYQKR 

YITDLFQENQ WLGEDLLFNL HYLKNIDRVS YLTEHLYFYR RGILSTVNSF 

KEGVFLQLEN LQKQVIVLFK QIYGEDFDVS IVKDTIRWQV FYYSLLMFKY GKQSIFDKFL 

IFRNLYKKYY FNLLKVSNKN SLSKNFCIRI VSNKVFKKIL WL 
MDTISKISII VPIYNVEKYL

RIRYFKKENG GLSDARNYGI 

RENALVAVAG YDRVDASGHF

ELFEDFRFEK GKIHEDEYFT 

SMTDHRFHCL LEFQNERMDF 
SKCIDSIVNQ TYKHIEILLV
SRAKGDYLAF IDSDDFIHSE 
LTAEPLPTNQ AVLSGRNVCK 
YRLLYELEKV AIVKECLYYY 
YESRGDKELL LECYRSFLAF 



NDGSTDNSEE ICLAYAKKDS
FIQRLHEAIE 

KLLEADGHRF WACNKLYKK 
VDRENSITTS 
AVLFLGKYNH WLSKQQKK 
AAGCTTATCG TCAAGGTGTT CGCTATATCG
TTGAAACACC AGAAAAAGTT ATCATGACTA 
GCAGTAGCAG AAGTTTATCC TGAAATACGA 
AAAGATATAT TAAGCAAACT TGAAAAAAAG 
CTCGCGCTAT ATTCTTTTGG AGTTCAGTAG 
AGTGAACGAA GTGACGCTAC TTGGGCTAAC 
AACGATATGA CGCCCTAGCG TTTCATGCAG 
GCTATACTCA GGTAAATAGT AATCATGTGC 
GATCGAGCAA AAGAATTTAA AAAACGTACT 
TGTGTTGCTA GCGATATGCA TAATTTATCT 
GGAGGCTTAT AAGTTGCTAA CAGAGGAATT 
AAAGAATCCT CTTATGCTAT TAAAAAACCA 
CTAGATTGTG GAGAGAAAAA TGGATTTAGG 
CAGTAAACGA TTGATACTCG TGTGCATGGA 
CCATGATTTT GAGCAGACTG TTTTTGGATG 
TTCTTGCAGT TTTATTCGTA TCAATTTTAT 
TTAAAAGTCT TTTCATTAAT TACGCGTTAC 
CTTAGTTTAA TATCTGCGCA TTCATTGTTT 
GTGGCAGGCT TTTAGTTATC GTTTCATCTT 
CATTACTCCG AGGATTGTTT GGAAAGTCTT 
CTATCCGTAA GAAGGATAGC CCACTAAGAA 
ATATTTTTAT CAATACTGTC AAAGATCGAA 
GGTATCGTTG ATCGTGATCC AAATAAACTT 
GGAAACCGTA ATGATATTCC ACGACTGGTA 
AGTGACGATT GCCATCCCTT CTTTAAATGG 
TAACACTACA GGAGTGACCG TCAATAATAT 
TGGCGGGGAA CATGTCTGTC AGTGCCTTTC 
GACCAGAGGT TGTTTTGGAT CAGGATGAAT 
AAAACAATCC TTGTCACAGG AGCAGGTGGC 
GCTAAGTTTA CGCCTAAACG CTTGTTGTTG 
AATCTATCTC ATTCATCGAG AGTTACTGGA 
TCTCATTGCA GATATTCAAG ATAGAGAATT 
AATATCAACC CGATGTTGTT TATCATGCTG 
ATAATCCACA TGAAGCAGTG AAGAATAATA 
GCTGAGGCGG CTAAAACTGC AAAGGTTGCC 
GTTAATCCAC CAAATGTCAT GGGAGCGACT 
TGTTACAGGT TTAAACGAGC CAGGTCAGAC 
TCTAGGTAGT CGTGGAAGTG TTGTTCCGCT 
AAGGTGGACC TGTTACGGTT ACCGACTTTA 
AGGCAAGTCG TTTGGTTATC CAAGCTGGAC 
ATATTTGTCT TGGATATGGG CGAGCCAGTA 
TTGTTAAGTG GACACACAGA GGAAGAAATC 
CAGACCAGGC GAGAAACTCT ACGAGGAATT 
GATTCATGAA AAAATATTTG TGGGTCGCGT 
TTGTCAATTC ATTTATCAAT GGATTACTCC 
TGATTGAATT TGCAAAACAA GAATAAGAAA 
CCTAGAGTTT AAACGATGTT TAAGTTCTAG 
TTACTATTTA TTAAGAGTCA GATAATAGCA 
TTTATAATAA GTATATTTGG TCAAAAGGGA 
TTTTAGCAAT TATTATCTCA GGGATTGCTA 
TTATTATTGA TTGCATTGGC AATTAAATTA 
AAGCGGGTTG GTAAAAACAA GTCATACTTT 
TATGTACGTT GACGCACCAA GTGATATGCC 
GATTACCAAG GTGGGCGCGT TTCTCAGAAA 
CACAGCTTTT TAATATTTTT AAAGGTGAAA 
GGAATCAATA TGACTTAATT GAAGAGCGAG 
ATTCGTCCTG GACTAACCGG TTGGGCTCAA 
GAAAAGTCAA AATTAGATGG ATATTATGTT 
GGATATTAAA TGTTTCTTAG GTACATTCCT
AGGTGGAACA GGGCAGAAAG GAAAAGGATG 
GGTCTATGAG AAAGAAAAAC CAGAGTTTCT 
TCAAACAATG ATTCCAACGG AGGTTGTCTT 
ATCAGAGCTT ATATAGTATT TTAGAAGAAT 
TAGCCTTGGA AAAGAATTCG GGTTTAGGAA 
AAACATTGTA ATTATGAGTG GGTTTGCACG 
ACACGTTTTG AAAAGCAAGT TAACTTTATA 



TGGCGACATC TCATAGACGA AAAGGGATGT 

ACTTTCTTCA ATTTAAAGAC 

TTGTGCTATG GTGCTGAATT GTATTATAGT 

AAAGTACCCA CACTTAATGG 

TGATACTCCT TGGAAAGAGA TTCAAGAAGC 

TCCCGTACTT GCCCATATAG 

AGAGAGTAGA AGAGTTAATT GACAAGGGAT 

TGAAGCCCAC TTTAATTGGT 

CGGTATTTTT TAGAGCAGGA TTTAGTACAT 

AGTAGACCTC CGTTTATGAG 

TGGCAAAGAT AAAGCGAAAG CGTTGCTAAA 

GGCGATTTAA ACTGGTTACT 

AACTGTTACT GATAAACTGT TAGAACGCAA 

TACGTGTCTT CTTATAGTTT 

TTATTATTGA CATACCAGAT GAACGCTTCA 

ATTTGATTCT ATCGTTTAGA 

ACAGGGTATC AGAGTTATGT AAAAATAGGA 

TTAATTATCT CAATGGTGTT 

AGTATCCTTA TTTTTGTCGT ATGTAATGCT 

ACATGAGACG AGAAAAAATG 

TCTTAGTAGT AGGTGCTGGA GATGGTGGTA 

AATTGAATTT TGAAATTGTC 

GGAACATTTA TCCGTACGGC TAAAGTTTTA

GAGGAATTAG CTGTTGACCA 

TAAGGAGCGA GAGAAGATTG TTGAAATCTG 

GCCGAGTATT GAAGACATTA

AGGAAATTGA CGTAGCAGAC CTTCTTGGTC 

TGAATCAGTT TTTCCAAGGG 

TCTATCGGTT CAGAGCTATG TCGTCAAATT 

CTTGGACATG GAGAAAATTC 
AAAGTACCAA GGTAAGATTG AGTTGGTCCC 

GATTTTTAGC ATAATGGCTG 

CAGCACATAA GCATGTTCCT TTGATGGAAT 

TTTTTGGAAC GAAGAATGTG 
AAATTTGTTA TGGTTTCAAC AGATAAAGCT 
AAACGTGTTG CAGAAATGAT 
TCAATTTGCG GCAGTCCGGT TTGGGAATGT 
ATTCAAAGAG CAAATTAGAA 
GGATGACTCG TTATTTCATG ACGATTCCTG
ATTTGGCAAA AGGTGGAGAA 
CAAATCCTGG AATTGGCAAG AAAAGTTATC 
GGGATTGTAG AATCTGGAAT 
ATTATCAACA GAAGAACGTG TCAGCGAACA 
TACAAATAAG CAGTCGGACA 
AAAAAGATAG AAATGAATTA AAAAATATGT 
GTAAAAAATA TTTTTACTTT 
GAAGGTTAGA ATACCTAATT AACAACAATA 
ACTAAGTGCT ACAAACTATC 
GATGTGAAAT GTATCCAATT TGTAAACGTA 
TTGTTGTTCT GAGTCCAATT 
GATTCTAAAG GTCCGGTATT ATTTAAACAA 
ATGATTTATA AATTCCGTTC 
GACTCATCTA TTAAAGGATC CTAAGGCGAT 
AACAAGTTTA GATGAACTGC 
TGGCGATTGT TGGTCCACGC CCAGCCTTAT 
ATAAATATGG TGCAAATGAT 
ATTAATGGTC GTGATGAATT GGAAATTGAT 
CAAAATATGA GTCTAGGTTT 
CAGTGTAGCC AGAAGCGAAG GTGTTGTTGA 
AAATTTTCAG TATTAATGTC 
TAGGGAATCT TTGGAAAGCA TCCTTGTCAA 
GGTAGAGGAT GGGCCACTCA 
TTAAAAGTCG ATTTTCATTT TTTAAAACGA 
TTGCACTGAA TGAAGGTTTG 
AAATGGATTC TGATGATGTT GCATATACAT 
AAACAAAACC CGACTATAGA 
TATTGAGATA GATGAGTTCT TAAATTCTAC TAGTGAAATA GTTTCTCATA AAAATGTTCC
AACCCAGCAC GATGAAATAT TAAAGATGGC AAGGCGGGAG AAATCCATGT 
GCCACATGAC TGTAATGTTT AAAAAGAAAA GTGTCGAGAG AGCAGGGGGG TATCAAACAC 
TTCCGTACGT AGAAGATTAT TTCCTTTGGG TGCGCATGAT TGCTTCAGGA 
TCGAAATTTG CAAACATTGA TGAAACACTA GTTCTTGCAC GTGTTGGAAA TGGGATGTTC 
AATAGGAGGG GGAACAGAGA ACAAATTAAC AGTTGGACAT TACTAATTGA 
ATTTATGTTA GCTCAAGGAA TTGTTACACC ACTAGATGTA TTTATTAATC AAATTTACAT 
TAGGGTCTTT GTTTATATGC CAACTTGGAT AAAGAAACTC ATTTATGGAA 
AAATCTTAAG GAAATAGTAT GATTACAGTA TTGATGGCTA CATATAATGG AAGCCCATTT 
ATAATAAAAC AGTTAGATTC AATTCGAAAT CAAAGTGTAT CAGCAGACAA 
AGTTATTATT TGGGATGATT GCTCGACAGA TGATACAATA AAAATAATAA AAGATTATAT 
AAAAAAATAT TCTTTGGATT CATGGGTTGT CTCTCAAAAT AAATCTAATC 
AGGGGCATTA TCAAACATTT ATAAATTTGA CAAAGTTAGT TCAGGAAGGA ATAGTCTTTT 
TTTCAGATCA AGATGATATT TGGGACTGTC ATAAAATTGA GACAATGCTT 
CCAATCTTTG ACAGAGAAAA TGTATCAATG GTGTTTTGCA AATCCAGATT GATTGATGAA 
AACGGAAATA TTATCAGTAG CCCAGATACT TCGGATAGAA TCAATACGTA 
CTCTCTAGA 
AYRQGVRYIV ATSHRRKGMF ETPEKVIMTN FLQFKDAVAE VYPEIRLCYG AELYYSKDIL
SKLEKKKVPT LNGSRYILLE FSSDTPWKEI QEAVNEVTLL GLTPVLAHIE 
RYDALAFHAE RVEELIDKGC YTQVNSNHVL KPTLIGDRAK EFKKRTRYFL EQDLVHCVAS 
DMHNLSSRPP FMREAYKLLT EEFGKDKAKA LLKKNPLMLL KNQAI 
MDLGTVTDKL LERNSKRLIL
YLILSFRLKV FSLITRYTGY 
RFILVSLFLS YVMLITPRIV. 
KLNFEIVGIV DRDPNKLGTF 
SLNGKEREKI VEICNTTGVT 
LNQFFQGKTI LVTGAGGSIG 
ELLEKYQGKI ELVPLIADIQ 
IFGTKNVAEA AKTAKVAKFV 
PGQTQFAAVR FGNVLGSRGS 
HLAKGGEIFV LDMGEPVQIL 
YEELLSTEER VSEQIHEKIF 
VCMDTCLLIV SMILSRLFLD
QSYVKIGLSL ISAHSLFLII 
WKVLHETRKN AIRKKDSPLR 
IRTAKVLGNR NDIPRLVEEL 
VNNMPSIEDI MAGNMSVSAF 
SELCRQIAKF TPKRLLLLGH 
DRELIFSIMA EYQPDWYHA 
MVSTDKAVNP PNVMGATKRV 
WPLFKEQIR KGGPVTVTDF 
SIRNQSVSAD KVIIWDDCST
CTGCAGCACA TAAGCATGTT CCATTGATGG AATATAATCC ACATGAAGCA GTGAAGAATA
ATATTTTTGG AACGAAGAAT GTGGCTGAGG CGGCTAAAAC TGCAAAGGTT 
GCCAAATTTG TTATGGTTTC AACAGATAAA GCTGTTAATC CGCCAAATGT CATGGGAGCG 
ACTAAACGTG TTGCAGAAAT GATTGTAACA GGTTTAAACG AGCCAGGTCA 
GACTCAATTT GCGGCAGTCC GTTTTGGGAA TGTTCTAGGT AGTCGTGGAA GTGTTGTTCC 
GCTATTCAAA GAGCAAATTA GAAAAGGTGG ACCTGTTACG GTTACCGACT 
TTAGGATGAC TCGTTATTTC ATGACGATTC CTGAGGCAAG TCGTTTGGTT ATCCAAGCTG 
GACATTTGGC AAAAGGTGGA GAAATCTTTG TCTTGGATAT GGGTGAGCCA 
GTACAAATCC TGGAATTGGC AAGAAAAGTT ATCTTGTTAA GCGGACATAC AGAGGAAGAA 
ATCGGGATTG TAGAATCTGG AATCAGACCA GGCGAGAAAC TCTACGAGGA 
ATTGTTATCA ACAGAAGAAC GTGTCAGCGA ACAGATTCAT GAAAAAATAT TTGTGGGTCG 
CGTTACAAAT AAGCAGTCGG ACATTGTCAA TTCATTTATC AATGGATTAC 
TCCAAAAAGA TAGAAATGAA TTAAAAGATA TGTTGATTGA ATTTGCAAAA CAAGAATAAG 
AAAGTAAAAA ATATTTTTAC TTTCCTAGAG TTTAAACGAT GTTTAAGTTC 
TAGGAAGGTT GGAATTGCTT TCGTGGAGGT GATAGATAGA AACCTATATA TTTGTAGAAG 
AAAGGATATT AAACTAAAGG TGAATCGGAA CATAAAGTTT AGATAGAGTT 
GGTATTTAAT GCCAAACAGG TGAATGCAAC CTCTCGCTCG TTACTAAGCA GGAGATAGTA 
AAGTTGCTTG AAAGAGAGTT TGTTAATCAG TATAAGTAGG CTAAAGTGAG 
AATATATATC TATTATTATC GGTAATGATA CTATTATTGA GAATTATTGT AGTGGGGATA
AAAATAATTT TTGGTGATTT TATCGTCCGA CTTAAAGGTG GGTTAAAAAA 
GTACTTATAT TCTTTTAGAA TTGATGAAAA ATATGGGGGA ATATAATATT TATAGGAGAT
ACGATGACTA GAGTAGAGTT GATTACTAGA GAATTTTTTA AGAAGAATGA 
AGCAACCAGT AAATATTTTC AGAAGATAGA ATCAAGAAGA GGTGAATTAT TTATTAAATT 
CTTTATGGAT AAGTTACTTG CGCTTATCCT ATTATTGCTA TTATCCCCAG 
TAATCATTAT ATTAGCTATT TGGATAAAAT TAGATAGTAA GGGGCCAATT TTTTATCGCC 
AAGAACGTGT TACGAGATAT GGTCGAATTT TTAGAATATT TAAGTTTAGA 
ACAATGATTT CTGATGCGGA TAAAGTCGGA AGTCTTGTCA CAGTCGGTCA AGATAATCGT 
ATTACGAAAG TCGGTCACAT TATCAGAAAA TATCGGCTGG ACGAAGTGCC 
CCAACTTTTT AATGTTTTAA TGGGGGATAT GAGCTTTGTA GGTGTAAGAC CAGAAGTACA 
AAAATATGTA AATCAGTATA CTGATGAAAT GTTTGCGACG TTACTTTTAC 
CTGCAGGAAT TACTTCACCA GCGAGTATTG CATATAAGGA TGAAGATATT GTTTTAGAAG 
AATATTGTTC TCAAGGCTAT AGTCCTGATG AAGCATATGT TCAAAAAGTA 
TTACCAGAAA AAATGAAGTA CAATTTGGAA TATATCAGAA ACTTTGGAAT TATTTCTGAT 
TTTAAAGTAA TGATTGATAC AGTAATTAAA GTAATAAAAT AGGAGATTAA 
AATGACAAAA AGACAAAATA TTCCATTTTC ACCACCAGAT ATTACCCAAG CTGAAATTGA 
TGAAGTTATT GACACACTAA AATCTGGTTG GATTACAACA GGACCAAAGA 
CAAAAGAGCT AGAACGTCGG CTATCAGTAT TTACAGGAAC CAATAAAACT GTGTGTTTAA 
ATTCTGCTAC TGCAGGATTG GAACTAGTCT TACGAATTCT TGGTGTTGGA 
CCCGGAGATG AAGTTATTGT TCCTGCTATG ACCTATACTG CCTCATGTAG TGTCATTACT 
CATGTAGGAG CAACTCCTGT GATGGTTGAT ATTCAAAAAA ACAGCTTTGA 
GATGGAATAT GATGCTTTGG AAAAAGCGAT TACTCCGAAA ACAAAAGTTA TCATTCCTGT 
TGATCTAGCT GGTATTCCTT GTGATTATGA TAAGATTTAT ACCATCGTAG 
AAAACAAACG CTCTTTGTAT GTTGCTTCTG ATAATAAATG GCAGAAACTT TTTGGGCGAG 
TTATTATCCT ATCTGATAGT GCACACTCAC TAGGTGCTAG TTATAAGGGA 
AAACCAGCGG GTTCCCTAGC AGATTTTACC TCATTTTCTT TCCATGCAGT TAAGAATTTT 
ACAACTGCTG AAGGAGGTAG TGTGACATGG AGATCACATC CTGATTTGGA 
TGACGAAGAG ATGTATAAAG AGTTTCAGAT TTACTCTCTT CATGGTCAGA CAAAGGATGC 
ATTAGCTAAG ACACAATTAG GGTCATGGGA ATATGACATT GTTATTCCTG 
GTTACAAGTG TAATATGACA GATATTATGG CAGGTATCGG TCTTGTGCAA TTAGAACGTT 
ACCCATCTTT GTTGAATCGT CGCAGAGAAA TCATTGAGAA ATACAATGCT 
GGCTTTGAGG GGACTTCGAT TAAGCCGTTG GTACACCTGA CGGAAGATAA ACAATCGTCT 
ATGCACTTGT ATATCACGCA TCTACAAGGC TATACTTTAG AACAACGAAA 
TGAAGTCATT CAAAAAATGG CTGAAGCAGG TATTGCGTGC AATGTTCACT ACAAACCATT 
ACCTCTTCTC ACAGCCTACA AGAATCTTGG TTTTGAAATG AAAGATTTTC 
CGAATGCCTA TCAGTATTTT GAAAATGAAG TTACACTGCC TCTTCATACC AACTTGAGTG 
ATGAAGATGT GGAGTATGTG ATAGAAATGT TTTTAAAAAT TGTTAGTAGA 
GATTAGTTAT TTTGGAAGGA GATATGGTGG AAAGAGATAT GGTGGAAAGA GACACGTTGG 
TATCTATAAT AATGCCCTCG TGGAATACAG CTAAGTATAT ATCTGAATCA 
ATCCAGTCAG TGTTGGACCA AACACACCAA AATTGGGAAC TTATAATCGT TGATGATTGT 
TCTAATGACG AAACTGAAAA AGTTGTTTCG CATTTCAAAG ATTCAAGAAT 
MTKRQNIPFS PPDITQAEID EVIDTLKSGW ITTGPKTKEL ERRLSVFTGT NKTVCLNSAT
NTAKYISESI QSVLDQTHQN
ESSQSLRVLV SGPAIVTRKM
WELIIVDDCS NDETEKWSH